(57) Abstract: This invention relates to a screening method for the identification of agents which modulate the activity of a DNA 
replication protein as a target for intervention in cancer therapy and includes agents which modulate said activity. The invention also 
relates to the use of the DNA replication protein, and its RNA transcripts, in the prognosis and diagnosis of proliferative disease, e.g. 
cancer. 
was isolated from a human medulloblastoma derived cDNA library using an in vivo 
tumorigenesis model (Warder and Keherly, 2003). Our analysis shows for the first time 
that Ciz1 plays a positive role in initiation of DNA replication. 

A number of changes to chromatin bound proteins occur when DNA synthesis is 
activated in vitro by recombinant cyclin A-cdk2. The present invention relates to the 
finding that a cdc6-related antigen, p85, correlates with the initiation of DNA replication 
and is regulated by cyclin A-cdk2. The protein was cloned from a mouse embryo library 
and identified as mouse Ciz1. 

In vitro analysis has shown that Ciz1 protein positively regulates initiation of DNA 
replication and that its activity is modulated by cdk phosphorylation at threonine 191/2, 
linking it to the cdk-dependent pathways that control initiation. The embryonic form of 
mouse Ciz1 is alternatively spliced, compared to predicted and somatic forms. Human 
Ciz1 is also alternatively spliced, with variability in the same exons as mouse Ciz1. It has 
been found that recombinant embryonic form Ciz1 promotes initiation of mammalian 
DNA replication and that pediatric cancers express 'embryonic-like' forms of Ciz1. 
Without wishing to be held to one theory, the inventors propose that Ciz1 mis-splicing 
produces embryonic-like forms of Ciz1 at inappropriate times in development. This 
promotes inappropriately regulated DNA replication and contributes to formation or 
progression of cancer cell lineages. 
A number of techniques have been developed in recent years which purport to 
specifically ablate genes and/or gene products. For example, the use of anti-sense 
nucleic acid molecules to bind to and thereby block or inactivate target mRNA 
molecules is an effective means to inhibit the production of gene products. 

A much more recent technique to specifically ablate gene function is through the 
introduction of double stranded RNA, also referred to as inhibitory RNA (RNAi), into a 
cell which results in the destruction of mRNA complementary to the sequence included 
in the RNAi molecule. The RNAi molecule comprises two complementary strands of 
RNA (a sense strand and an antisense strand) annealed to each other to form a double 
stranded RNA molecule. The RNAi molecule is typically derived from the exonic or 
coding sequence of the gene which is to be ablated. 
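
By way of illustration only, the relationship between the sense strand and the antisense strand of such a double stranded RNA molecule can be sketched in Python. The function names, the 21-nucleotide default length and the example sequence are hypothetical and are not taken from the specification; the sketch assumes the coding sequence is supplied as DNA written 5' to 3'.

```python
# Illustrative sketch only: helper names, the default length and the example
# sequence are assumptions, not part of the specification.

_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def to_rna(dna):
    """Transcribe a coding-strand DNA sequence into the equivalent RNA sequence."""
    return dna.upper().replace("T", "U")


def antisense_strand(sense_rna):
    """Return the reverse complement of an RNA sense strand, written 5' to 3'."""
    try:
        return "".join(_RNA_COMPLEMENT[base] for base in reversed(sense_rna.upper()))
    except KeyError as exc:
        raise ValueError(f"unexpected base in RNA sequence: {exc}") from None


def design_duplex(coding_dna, start, length=21):
    """Take a window of the coding sequence and return (sense, antisense) RNA strands."""
    window = coding_dna[start:start + length]
    if len(window) != length:
        raise ValueError("window extends beyond the end of the coding sequence")
    sense = to_rna(window)
    return sense, antisense_strand(sense)


if __name__ == "__main__":
    sense, antisense = design_duplex("ATGGCCTTAGCA", 0, length=6)
    print("sense    ", sense)
    print("antisense", antisense)
```

The antisense strand is the reverse complement of the sense strand, so the two strands can anneal to each other; applying `antisense_strand` twice returns the original sense strand.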

Nucleic acids and proteins have both a linear sequence structure, as defined by their base 
or amino acid sequence, and also a three dimensional structure which in part is 
determined by the linear sequence and also the environment in which these molecules 
are located. Conventional therapeutic molecules are small molecules, for example, 
peptides, polypeptides, or antibodies, which bind target molecules to produce an 
agonistic or antagonistic effect. It has become apparent that nucleic acid molecules also 
have potential with respect to providing agents with the requisite binding properties 
which may have therapeutic utility. These nucleic acid molecules are typically referred 
to as aptamers. Aptamers are small, usually stabilised, nucleic acid molecules which 
comprise a binding domain for a target molecule. 
Aptamers may comprise at least one modified nucleotide base. The term "modified 
nucleotide base" encompasses nucleotides with a covalently modified base and/or sugar. 
For example, modified nucleotides include nucleotides having sugars which are 
covalently attached to low molecular weight organic groups other than a hydroxyl group 
at the 3' position and other than a phosphate group at the 5' position. Thus modified 
nucleotides may also include 2' substituted sugars such as 2'-O-methyl-; 2'-O-alkyl; 
2'-O-allyl; 2'-S-alkyl; 2'-S-allyl; 2'-fluoro-; 2'-halo or 2'-azido-ribose; carbocyclic sugar 
analogues; α-anomeric sugars; epimeric sugars such as arabinose, xyloses or lyxoses; 
pyranose sugars; furanose sugars; and sedoheptulose. 



Modified nucleotides are known in the art and include, by example and not by way of 
limitation, alkylated purines and/or pyrimidines; acylated purines and/or pyrimidines; 
or other heterocycles. These classes of pyrimidines and purines are known in the art and 
include pseudoisocytosine; N4,N4-ethanocytosine; 8-hydroxy-N6-methyladenine; 
4-acetylcytosine; 5-(carboxyhydroxylmethyl) uracil; 5-fluorouracil; 5-bromouracil; 
5-carboxymethylaminomethyl-2-thiouracil; 5-carboxymethylaminomethyl uracil; 
dihydrouracil; inosine; N6-isopentyl-adenine; 1-methyladenine; 1-methylpseudouracil; 
1-methylguanine; 2,2-dimethylguanine; 2-methyladenine; 2-methylguanine; 
3-methylcytosine; 5-methylcytosine; N6-methyladenine; 7-methylguanine; 
5-methylaminomethyl uracil; 5-methoxy amino methyl-2-thiouracil; 
β-D-mannosylqueosine; 5-methoxycarbonylmethyluracil; 5-methoxyuracil; 
2-methylthio-N6-isopentenyladenine; uracil-5-oxyacetic acid methyl ester; pseudouracil; 
2-thiocytosine; 5-methyl-2-thiouracil; 2-thiouracil; 4-thiouracil; 5-methyluracil; 
N-uracil-5-oxyacetic acid methylester; uracil-5-oxyacetic acid; queosine; 2-thiocytosine; 
5-propyluracil; 5-propylcytosine; 5-ethyluracil; 5-ethylcytosine; 5-butyluracil; 
5-pentyluracil; 5-pentylcytosine; and 2,6-diaminopurine; methylpseudouracil; 
1-methylguanine; 1-methylcytosine. 

Aptamers may be synthesized using conventional phosphodiester linked nucleotides 
using standard solid or solution phase synthesis techniques which are known in the art. 
Linkages between nucleotides may use alternative linking molecules. For example, 
linking groups of the formula P(O)S, (thioate); P(S)S, (dithioate); P(O)NR'2; P(O)R'; 
P(O)OR6; CO; or CONR'2 wherein R is H (or a salt) or alkyl (1-12C) and R6 is alkyl 
(1-9C) is joined to adjacent nucleotides through -O- or -S-. 

Other techniques which purport to specifically ablate genes and/or gene products focus 
on modulating the function or interfering with the activity of protein molecules. 
Proteins can be targeted by chemical inhibitors drawn, for example, from existing small 
molecule libraries. 

Antibodies, preferably monoclonal, can be raised for example in mice or rats against 
different protein isoforms. Antibodies, also known as immunoglobulins, are protein 
molecules which have specificity for foreign molecules (antigens). Immunoglobulins 
(Ig) are a class of structurally related proteins consisting of two pairs of polypeptide 
chains, one pair of light (L) (low molecular weight) chain (κ or λ), and one pair of heavy 
(H) chains (γ, α, μ, δ and ε), all four linked together by disulphide bonds. Both H and L 
chains have regions that contribute to the binding of antigen and that are highly variable 
from one Ig molecule to another. In addition, H and L chains contain regions that are 
non-variable or constant. 

The L chains consist of two domains. The carboxy-terminal domain is essentially 
identical among L chains of a given type and is referred to as the "constant" (C) region. 
The amino terminal domain varies from one L chain to another and contributes to the 
binding site of the antibody. Because of its variability, it is referred to as the "variable" 
(V) region. 

The H chains of Ig molecules are of several classes, α, μ, δ, ε and γ (of which there are 
several sub-classes). An assembled Ig molecule, consisting of one or more units of two 
identical H and L chains, derives its name from the H chain that it possesses. Thus, 
there are five Ig isotypes: IgA, IgM, IgD, IgE and IgG (with four sub-classes based on 
the differences in the H chains, i.e., IgG1, IgG2, IgG3 and IgG4). Further detail 
regarding antibody structure and their various functions can be found in Using 
Antibodies: A laboratory manual, Cold Spring Harbour Laboratory Press. 

Chimeric antibodies are recombinant antibodies in which all of the V-regions of a mouse 
or rat antibody are combined with human antibody C-regions. Humanised antibodies are 
recombinant hybrid antibodies which fuse the complementarity determining regions from 
a rodent antibody V-region with the framework regions from the human antibody 
V-regions. The C-regions from the human antibody are also used. The complementarity 
determining regions (CDRs) are the regions within the N-terminal domain of both the 
heavy and light chain of the antibody to which the majority of the variation of the 
V-region is restricted. These regions form loops at the surface of the antibody molecule. 
These loops provide the binding surface between the antibody and antigen. 

Antibodies from non-human animals provoke an immune response to the foreign 
antibody and its removal from the circulation. Both chimeric and humanised antibodies 
have reduced antigenicity when injected into a human subject because there is a reduced 
amount of rodent (i.e. foreign) antibody within the recombinant hybrid antibody, while 
the human antibody regions do not elicit an immune response. This results in a weaker 
immune response and a decrease in the clearance of the antibody. This is clearly 
desirable when using therapeutic antibodies in the treatment of human diseases. 
Humanised antibodies are designed to have less "foreign" antibody regions and are 
therefore thought to be less immunogenic than chimeric antibodies. 

Other techniques for targeting at the protein level include the use of randomly generated 
peptides that specifically bind to proteins, and any other molecules which bind to 
proteins or protein variants and modify the function thereof. 
Understanding the DNA replication process is of prime concern in the field of cancer 
therapy. It is known that cancer cells can become resistant to chemotherapeutic agents 
and can evade detection by the immune system. There is an ongoing need to identify 
targets for cancer therapy so that new agents can be identified. The DNA replication 
process represents a prime target for drug intervention in cancer therapy. There is a 
need to identify gene products which modulate DNA replication and which contribute to 
formation or progression of cancer cell lineages, and to develop agents that affect their 
function. 

Statements of the invention 

According to one aspect of the present invention there is provided the use of a Ciz1 
nucleotide or polypeptide sequence, or any fragment or variant thereof, as a target for 
the identification of agents which modulate DNA replication. 

As used herein the term 'fragment' or 'variant' is used to refer to any nucleic or amino 
acid sequence which is derived from the full length nucleotide or amino acid sequence 
of Ciz1 or derived from a splice variant thereof. In one embodiment of the invention the 
fragment is of sufficient length and/or of sufficient homology to full length Ciz1 to 
retain the DNA replication activity of Ciz1. In an alternative embodiment inactive Ciz1 
fragments are used. The term 'fragment' or 'variant' also relates to the Ciz1 RNA 
transcripts described herein and protein isoforms (or parts thereof). 
As used herein the term 'modulate' is used to refer to either increasing or decreasing 
DNA replication, above and below the levels which would normally be observed in the 
absence of the specific agent (i.e., any alterations in DNA replication activity which are 
either directly or indirectly linked to the use of the agent). The term 'modulate' also 
includes reference to a change of spatial or temporal organisation of DNA replication. 

According to an alternative aspect of the invention there is provided a screening method 
for the identification of agents which modulate DNA replication wherein the screening 
method comprises the use of Ciz1 nucleotide or polypeptide sequence or fragments or 
variants thereof. 



Preferably the screening method comprises detecting or measuring the effect of an agent 
on a nucleic acid molecule selected from the group consisting of: 

a) a nucleic acid molecule comprising a nucleic acid sequence represented in any 
of Figures 14, 15, or 21; 

b) a nucleic acid molecule which hybridises to the nucleic acid sequence in (a) 
and which has Ciz1 activity or activity of a variant thereof; 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and a candidate agent to 
be tested; 

d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus 
or a nucleic acid molecule that hybridises to the genomic sequence. 
In one embodiment of the invention, the nucleic acid molecule is modified by deletion, 
substitution or addition of at least one nucleic acid residue of the nucleic acid sequence. 

Alternatively the screening method comprises the steps of: 

(i) forming a preparation comprising a polypeptide molecule, or an active fragment 
thereof, encoded by a nucleic acid molecule selected from the group consisting of: 

a) a nucleic acid molecule comprising a nucleic acid sequence represented in 
Figs 14, 15 or 21; 

b) a nucleic acid molecule which hybridises to the nucleic acid sequence in (a) 
and which has Ciz1 activity or activity of a variant thereof; 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and a candidate agent to 
be tested; 

d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus 
or a nucleic acid molecule that hybridises to the genomic sequence; and 

(ii) detecting or measuring the effect of the agent on the activity of said polypeptide. 

Assays for the detection of DNA replication are known in the art. Activity residing in 
Ciz1, or derived peptide fragments, and the effect of potential therapeutic agents on that 
activity would be assayed in vitro or in vivo. 

In vitro assays for Ciz1 protein activity would comprise synchronised isolated G1 phase 
nuclei and either S phase extract or G1 phase extract supplemented with 
cyclin-dependent kinases. Inclusion of Ciz1 or derived peptide fragments stimulates initiation 
of DNA replication in these circumstances and can be monitored visually (by scoring 
nuclei that have incorporated fluorescent nucleotides during in vitro reactions) or by 
measuring incorporation of radioactive nucleotides. The assay for therapeutic reagents 
that interfere with Ciz1 protein function would involve looking for inhibition of DNA 
replication in these assays. The effect of agents on Ciz1 nuclear localisation, chromatin 
binding, stability, modification and protein-protein interactions could also be monitored 
in these assays. 
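
By way of example only, the scoring of such an in vitro assay might be reduced to numbers as sketched below in Python. The function names, the subtraction of a background of nuclei already in S phase, and the example counts are assumptions made for illustration; they are not prescribed by the specification.

```python
# Illustrative sketch only: names, the background correction and the example
# counts are assumptions, not part of the specification.


def percent_labelled(labelled_nuclei, total_nuclei):
    """Percentage of scored nuclei that incorporated labelled nucleotides."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= labelled_nuclei <= total_nuclei:
        raise ValueError("labelled_nuclei must lie between 0 and total_nuclei")
    return 100.0 * labelled_nuclei / total_nuclei


def percent_initiation(labelled_nuclei, total_nuclei, s_phase_background_pct):
    """Labelled percentage minus the percentage of nuclei already in S phase."""
    return percent_labelled(labelled_nuclei, total_nuclei) - s_phase_background_pct


def percent_inhibition(control_initiation_pct, treated_initiation_pct):
    """Reduction in initiation caused by a candidate agent, relative to the control."""
    if control_initiation_pct <= 0:
        raise ValueError("control initiation must be positive")
    return 100.0 * (control_initiation_pct - treated_initiation_pct) / control_initiation_pct


if __name__ == "__main__":
    control = percent_initiation(45, 100, s_phase_background_pct=15.0)
    treated = percent_initiation(27, 100, s_phase_background_pct=15.0)
    print(f"control initiation: {control:.1f}%")
    print(f"treated initiation: {treated:.1f}%")
    print(f"inhibition by agent: {percent_inhibition(control, treated):.1f}%")
```

A candidate agent that interferes with Ciz1 protein function would be expected to give a positive inhibition value in such a calculation; an agonist would give a negative one.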

In vivo assays will include creation of cell and mouse models that over-express or 
under-express Ciz1, or derived fragments, resulting in altered cell proliferation. The 
preparation of transgenic animals is generally known in the art and within the ambit of 
the skilled person. The assay for therapeutic reagents would involve analysis of 
cell-cycle time, initiation of DNA replication and cancer incidence in the presence and 
absence of drugs that either impinge on Ciz1 protein activity, or interfere with Ciz1 
production by targeting Ciz1 and its variants at the RNA level. 

In a preferred method of the invention said hybridisation conditions are stringent. 

Stringent hybridisation/washing conditions are well known in the art. For example, 
stringent conditions are those under which nucleic acid hybrids are stable after washing 
in 0.1x SSC, 0.1% SDS at 60°C. It is well known in the art that optimal hybridisation 
conditions can be calculated if the sequence of the nucleic acid is known. Typically, 
hybridisation conditions use 4-6x SSPE (20x SSPE contains 175.3 g NaCl, 88.2 g 
NaH2PO4·H2O and 7.4 g EDTA dissolved to 1 litre and the pH adjusted to 7.4); 5-10x 
Denhardt's solution (50x Denhardt's solution contains 5 g Ficoll (Type 400, Pharmacia), 
5 g polyvinylpyrrolidone and 5 g bovine serum albumin); 100 µg-1.0 mg/ml sonicated 
salmon/herring DNA; 0.1-1.0% sodium dodecyl sulphate; optionally 40-60% deionised 
formamide. Hybridisation temperature will vary depending on the GC content of the 
nucleic acid target sequence but will typically be between 42°C and 65°C. 
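
By way of example only, the arithmetic behind preparing a working-strength buffer from a concentrated stock (for instance 5x SSPE from 20x SSPE), and the GC content on which the hybridisation temperature depends, can be sketched in Python as follows. The function names and example values are hypothetical; the sketch assumes simple proportional dilution and makes no attempt to predict an optimal hybridisation temperature.

```python
# Illustrative sketch only: function names and example values are assumptions.


def dilution_volumes(stock_x, working_x, final_volume_ml):
    """Return (stock_ml, diluent_ml) needed to dilute a stock to working strength.

    Uses C1 * V1 = C2 * V2, with concentrations expressed as 'x' multiples.
    """
    if stock_x <= 0 or working_x <= 0 or final_volume_ml <= 0:
        raise ValueError("all arguments must be positive")
    if working_x > stock_x:
        raise ValueError("working strength cannot exceed stock strength")
    stock_ml = final_volume_ml * working_x / stock_x
    return stock_ml, final_volume_ml - stock_ml


def gc_fraction(sequence):
    """Fraction of G and C bases in a nucleic acid sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(base) for base in "GC") / len(seq)


if __name__ == "__main__":
    sspe_ml, water_ml = dilution_volumes(stock_x=20, working_x=5, final_volume_ml=100.0)
    print(f"20x SSPE: {sspe_ml:.1f} ml, diluent: {water_ml:.1f} ml")
    print(f"GC fraction of ATGGCCTTAGCA: {gc_fraction('ATGGCCTTAGCA'):.2f}")
```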

In a preferred method of the invention said polypeptide is modified by deletion, 
substitution or addition of at least one amino acid residue of the polypeptide sequence. 

A modified or variant polypeptide, i.e. a fragment polypeptide, and a reference polypeptide may differ 
in amino acid sequence by one or more substitutions, additions, deletions or truncations, 
which may be present in any combination. Among preferred variants are those that vary 
from a reference polypeptide by conservative amino acid substitutions. Such 
substitutions are those that substitute a given amino acid by another amino acid of like 
characteristics. The following non-limiting list of amino acids are considered 
conservative replacements (similar): a) alanine, serine, and threonine; b) glutamic acid 
and aspartic acid; c) asparagine and glutamine; d) arginine and lysine; e) isoleucine, 
leucine, methionine and valine; and f) phenylalanine, tyrosine and tryptophan. Preferred 
are variants which retain the same biological function and activity as the reference 
polypeptide from which they vary. Alternatively, variants include those with an altered 
biological function, for example variants which act as antagonists, so called "dominant 
negative" variants. 

Alternatively or in addition, non-conservative substitutions may give the desired 
biological activity; see Cain SA, Williams DM, Harris V, Monk PN. Selection of novel 
ligands from a whole-molecule randomly mutated C5a library. Protein Eng. 2001 
Mar;14(3):189-93, which is incorporated by reference. 

A functionally equivalent polypeptide sequence according to the invention is a variant 
wherein one or more amino acid residues are substituted with conserved or 
non-conserved amino acid residues, or one in which one or more amino acid residues 
includes a substituent group. Conservative substitutions are the replacements, one for 
another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the 
hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution 
between amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and 
replacements among aromatic residues Phe and Tyr. 
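
By way of illustration only, the conservative substitution groups recited in the preceding paragraph can be expressed as a lookup in Python, using one-letter amino acid codes. The function names and the example sequences are hypothetical; the sketch assumes two sequences of equal length compared position by position, without any alignment step.

```python
# Illustrative sketch only: names and example sequences are assumptions.
# Groups follow the preceding paragraph (one-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(residue_a, residue_b):
    """True if two different residues fall within the same conservative group."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref_residue, var_residue, conservative) for each difference.

    Positions are 1-based; sequences must already be the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
        if r != v
    ]


if __name__ == "__main__":
    for change in classify_substitutions("MKLS", "MRLD"):
        print(change)
```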

In addition, the invention features nucleotide or polypeptide sequences having at least 
50% identity with the nucleotide or polypeptide sequences as herein disclosed, or 
fragments and functionally equivalent polypeptides thereof. In one embodiment, the 
nucleotide or polypeptide sequences have at least 75% to 85% identity, more preferably 
at least 90% identity, even more preferably at least 95% identity, still more preferably at 
least 97% identity, and most preferably at least 99% identity with the nucleotide and 
amino acid sequences illustrated herein. 
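
By way of example only, a percentage identity of the kind referred to above might be computed as sketched below in Python. The specification does not define how identity is to be calculated; the sketch assumes two sequences that have already been aligned to equal length, counts identical non-gap positions, and divides by the full alignment length (so gap positions count as mismatches). The function names are hypothetical.

```python
# Illustrative sketch only: the identity definition and names are assumptions.
IDENTITY_TIERS = (50, 75, 85, 90, 95, 97, 99)  # thresholds recited in the text


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(identity_pct):
    """Return the highest recited threshold that the identity meets, or None."""
    for tier in sorted(IDENTITY_TIERS, reverse=True):
        if identity_pct >= tier:
            return tier
    return None


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTTC")
    print(f"identity: {pid:.1f}%  highest tier met: {highest_tier_met(pid)}")
```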

In a preferred method of the invention said nucleic acid molecule comprises the nucleic 
acid sequence encoding the amino acid sequence Ciz1 in Fig 16 or Fig 17 or any 
variants thereof, including those described in Figures 20A and 20B. In a further 
preferred method of the invention said nucleic acid molecule consists of the nucleic acid 
sequence which encodes the amino acid sequence Ciz1 in Fig 16 or Fig 17 or variants 
thereof, including those described in Figures 20A and 20B. 

In a further preferred method of the invention said polypeptide molecule comprises the 
amino acid sequence Ciz1 in Fig 16 or 17 or variants thereof, including those described 
in Figures 20A and 20B. In a further preferred method of the invention said 
polypeptide molecule consists of the amino acid sequence Ciz1 in Fig 16 or 17 or 
variants thereof, including those described in Figures 20A and 20B. 

In a further preferred method of the invention said polypeptide is expressed by a cell, 
preferably a mammalian cell, or animal and said screening method is a cell-based 
screening method. 

Preferably said cell naturally expresses the Ciz1 polypeptide. Alternatively said cell is 
transfected with a nucleic acid molecule encoding a Ciz1 polypeptide (or a variant 
molecule thereof, found, for example, in cancer cell lineages). 

According to a further aspect of the invention there is provided an agent obtainable by 
the method according to the invention. 

Preferably said agent is an antagonist of Ciz1 mediated DNA replication. Alternatively 
said agent is an agonist of Ciz1 mediated DNA replication. 

In a further preferred method of the invention said agent is selected from the group 
consisting of: polypeptide; peptide; aptamer; chemical; antibody; nucleic acid; or 
polypeptide or nucleotide probe. 

Preferably the agent comprises a sequence that is complementary or of sufficient 
homology to give specific binding to the target and can be used to detect the level of 
nucleic acid or protein for diagnostic purposes. 

Alternatively the agent identified by the method of the invention is a therapeutic agent 
and can be used for the treatment of disease. 

In one embodiment of the invention the agent is an antibody molecule and binds to any 
of the sequences represented by Figures 16, 17 or 20. 

Preferably said antibody is a monoclonal antibody. 
Alternatively said agent is an anti-sense nucleic acid molecule which binds to and 
thereby blocks or inactivates the mRNA encoded by any of the nucleic acid sequences in 
(i) above. 

In an alternative embodiment, said agent is an RNAi molecule and comprises two 
complementary strands of RNA (a sense strand and an antisense strand) annealed to each 
other to form a double stranded RNA molecule. Preferably the RNAi molecule is 
derived from the exonic sequence of the Ciz1 gene or from another over-lapping gene. 

In one embodiment unspliced mRNA is targeted with RNAi to inhibit production of the 
spliced variant. In another the spliced variant mRNA is ablated without affecting the 
non-variant mRNA. 

In a preferred method of the invention said peptide is an oligopeptide. Preferably, said 
oligopeptide is at least 10 amino acids long. Preferably said oligopeptide is at least 20, 
30, 40, 50 amino acids in length. 

In a further preferred method of the invention said peptide is a modified peptide. 

It will be apparent to one skilled in the art that modified amino acids include, by way of 
example and not by way of limitation, 4-hydroxyproline, 5-hydroxylysine, 
N6-acetyllysine, N6-methyllysine, N6,N6-dimethyllysine, N6,N6,N6-trimethyllysine, 
cyclohexylalanine, D-amino acids, ornithine. Other modifications include amino acids 
with a C2, C3 or C4 alkyl R group optionally substituted by 1, 2 or 3 substituents selected 
from halo (e.g. F, Br, I), hydroxy or C1-C4 alkoxy. 

Alternatively said peptide is modified by acetylation and/or amidation. 

In a preferred method of the invention the polypeptides or peptides are modified by 
cyclisation. Cyclisation is known in the art (see Scott et al Chem Biol (2001), 8:801- 
815; Gellerman et al J. Peptide Res (2001), 57: 277-291; Dutta et al J. Peptide Res 
(2000), 8: 398-412; Ngoka and Gross J Amer Soc Mass Spec (1999), 10:360-363). 

According to a further aspect of the invention there is provided a vector as a delivery 
means for, for example, an antisense or an RNAi molecule which inhibits Ciz1 or 
variants thereof and thereby allows the targeting of cells expressing the protein. 

In one embodiment of the invention a viral vector is used as delivery means. 

Preferably the vector includes an expression cassette comprising the nucleotide sequence 
selected from the group consisting of: 

a) the nucleic acid sequence which encodes Ciz1 amino acid sequence as shown 
in Fig 14, 15 and 21; 

b) a nucleic acid molecule which hybridizes to the nucleic acid sequence of (a); 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and any sequence which 
is complementary to any of the above sequences; 

d) a nucleic acid sequence that encodes Ciz1 pre-mRNA (i.e., the genomic 
sequence), 

wherein the expression cassette is transcriptionally linked to a promoter 
sequence. 

Preferably the vector including the expression cassette is adapted for eukaryotic gene 
expression. Typically said adaptation includes, by example and not by way of 
limitation, the provision of transcription control sequences (promoter sequences) which 
mediate cell/tissue specific expression. These promoter sequences may be cell/tissue 
specific, inducible or constitutive. 

Promoter elements typically also include so called TATA box and RNA polymerase 
initiation selection sequences which function to select a site of transcription initiation. 
These sequences also bind polypeptides which function, inter alia, to facilitate 
transcription initiation selection by RNA polymerase. 

Adaptations also include the provision of selectable markers and autonomous replication 
sequences which both facilitate the maintenance of said vector in either the eukaryotic 
cell or prokaryotic host. Vectors which are maintained autonomously are referred to as 
episomal vectors. Further adaptations which facilitate the expression of vector encoded 
genes include the provision of transcription termination sequences. 

These adaptations are well known in the art. There is a significant amount of published 
literature with respect to expression vector construction and recombinant DNA 
techniques in general. Please see, Sambrook et al (1989) Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbour Laboratory, Cold Spring Harbour, NY and 
references therein; Marston, F (1987) DNA Cloning Techniques: A Practical Approach 
Vol III, IRL Press, Oxford UK; DNA Cloning: F M Ausubel et al, Current Protocols in 
Molecular Biology, John Wiley & Sons, Inc. (1994). 

According to the present invention there is provided a diagnostic method for the 
identification of proliferative disorders comprising detecting the presence or expression 
of the Ciz1 gene, Ciz1 splice variants and mutations in the genomic or protein sequence 
thereof. 

Preferably said diagnostic method comprises one or more of the following steps: 

(i) contacting a sample isolated from a subject to be tested with an agent which 
specifically binds a polypeptide with Ciz1 activity or a nucleic acid molecule 
encoding a polypeptide with Ciz1 activity; and 

(ii) detecting or measuring the binding of the agent on said polypeptide or nucleic 
acid in said sample; 

(iii) use of reverse-transcribed PCR or real-time PCR to monitor Ciz1 isoform 
expression and to measure expression levels; 

(iv) measuring the presence of nucleic acid or amino-acid mutations based on altered 
conformational properties of the molecule. 

In one embodiment, the diagnostic method of the present invention is carried out in-vivo. 

In an alternative embodiment, the diagnostic method of the present invention is carried 
out ex-vivo or in-vitro. 

Preferably the diagnostic method provides for a quantitative measure of Ciz1 RNA or 
protein variants in a sample. 

In one embodiment of the invention there is provided the use of an agent which 
modulates Ciz1 RNA or protein, or variants thereof, as a pharmaceutical. 

Preferably said pharmaceutical comprises an agent identified by the screening method of 
the present invention in combination or association with a pharmaceutically acceptable 
carrier, excipient or diluent. 

Preferably said pharmaceutical is for oral or topical administration or for administration 
by injection. In an alternative embodiment of the invention the pharmaceutical is 
administered as an aerosol. 
In a further preferred embodiment of the invention there is provided the use of an agent 
according to the invention for the manufacture of a medicament for use in the treatment 
of proliferative disease. Preferably said proliferative disease is cancer. 

Preferably said cancer is a paediatric cancer and is selected from the group consisting of: 
retinoblastoma, neuroblastoma, Burkitt lymphoma, medulloblastoma, Ewings Sarcoma 
family tumours (ESFTs). 

In an alternative embodiment the cancer is a carcinoma, adenocarcinoma, lymphoma or 
leukemia. 

In an alternate embodiment the disease is liver, lung or skin cancer or metastasis. 

According to a further aspect of the invention there is provided a method to treat a 
proliferative disease comprising administering to an animal, preferably a human, an 
agent obtainable by the method according to the invention. 

According to an alternate aspect of the invention, there is provided the use of an agent 
according to the invention for the manufacture of a medicament to slow cell division or 
growth. 
The invention also includes the use of the Ciz1 amino acid sequence and protein 
structure in rational drug design and the use of Ciz1 nucleotide and amino acid 
sequences thereof or variants thereof for screening chemical libraries for agents that 
specifically bind to Ciz1. 

The invention also includes a kit comprising a diagnostic, prognostic or therapeutic 
agent identified by the method of the invention. 

In an alternative embodiment of the invention, an array based sequencing chip is used 
for the detection of altered Ciz1. 

Detailed Description of the invention 

An embodiment of the invention will now be described by example only and with 
reference to the following figures: 

Fig. 1 Illustrates the effect of cyclin A-cdk2 on late G1 nuclei. A) Anti-Cdc6 antibody 
VI detects mouse Cdc6 and a second antigen in western blots of 3T3 whole cell extract, 
which migrates with approximate Mr of 100 kDa (based on the mobility of the Mcm3 
protein this was previously estimated at nearer 85 kDa so the antigen was named p85; 
we have kept the same name here for clarity). p85 is present in both the soluble fraction 
and insoluble nuclear fraction (prepared under in vitro replication conditions). B) 
Initiation of DNA synthesis in 'replication competent' late G1 phase nuclei by G1 phase 
extract supplemented with recombinant cyclin A-cdk2. Control bar shows the proportion 
of nuclei already in S phase (unshaded), and those that initiated replication in extract 
from S phase cells (shaded). C) After 15 minutes under cell-free replication conditions 
nuclei were washed and the chromatin fraction was re-isolated and separated by 
SDS-PAGE and blotted for Mcm2 and Mcm3. D) The same nuclei blotted with antibody VI. 
p85 antigen is more abundant in nuclei exposed to initiation-inducing concentrations of 
cyclin A-cdk2. Antibody VI was used to clone the gene for p85 from a mouse embryo 
expression library which was identified as Ciz1. 

Fig. 2 Alignment of mouse Cizl variants. The predicted full-length Cizl amino-acid
sequence ('Full') is identical to a mouse mammary tumour cDNA clone (BC018483),
while embryonic Cizl ('ECizl', AJ575057), and a melanoma-derived clone
(AK089986) lack two discrete internal sequences. In addition, the first available
methionine in ECizl is in the middle of exon 3 (Met84), which excludes a polyglutamine
rich region from the N-terminus. Melanoma derived AK089986 may be incomplete as it
ends 77 codons before the C-terminus of all other mouse and human clones. Stars
indicate amino-acids changed by site-directed mutagenesis in the constructs shown in D.
Amino-acids that correspond to codons targeted by siRNAs are underlined. B) Mouse
Cizl is encoded by at least 17 exons. Coding exons are shown in grey, alternatively 
spliced regions are black, untranslated regions are white. Two alternative exon 1 
sequences are included in some Cizl transcripts (not shown) but an alternative 
translational start site upstream of the two depicted here has not yet been found. C) 
Sequence features and putative domains in ECizl. Predicted nuclear localisation 
sequence (NLS), putative cyclin-dependent kinase phosphorylation sites, C2H2 type
zinc-fingers and a C terminal domain with homology to the nuclear matrix protein
matrin 3 (Nakayasu and Berezney, 1991) are shown. The positions of sequences absent
from ECizl are indicated by triangles. D) ECizl and derived truncations and point
mutants used in cell-free DNA replication experiments. Numbers in parentheses relate to
amino-acid positions in the full-length form of mouse Cizl, shown in A. Stars indicate
putative phosphorylation sites ablated by site-directed mutagenesis.

Fig. 3 Shows the effect of Cizl protein and derived fragments in cell-free DNA
replication experiments and illustrates that ECizl promotes initiation of mammalian
DNA replication. A) Recombinant ECizl stimulates initiation of DNA replication in
'replication competent' late G1 phase nuclei, during incubation in S phase extract.
Histogram shows the average number of nuclei that incorporated biotinylated
nucleotides in vitro (black), in the presence or absence of ectopic ECizl, with standard
deviations calculated from four independent experiments. The 17% of nuclei that were
already in S phase when the nuclear preparation was made are shown in white. Images
show nuclei replicating in vitro, with or without 1 nM ECizl. Total nuclei are
counterstained with propidium iodide (red). B) The response to recombinant ECizl is
concentration dependent with a sharp optimum in the nM range. In this experiment, and
all those shown in B-I, results are expressed as % initiation rather than % replication.
This is calculated from the number of nuclei that initiate in vitro and the number of
nuclei that are 'competent' to initiate in vitro (see methods). C) Threonines 191/2 are
involved in regulating Cizl DNA replication activity, as the ECizl cdk site mutant
T(191/2)A escapes suppression at high concentrations. D) Cdk site mutant T(293)A
stimulates initiation with a similar profile to ECizl but at lower concentrations. E)
Truncated ECizl (Nterm442) lacks C-terminal sequences, but stimulates in vitro
initiation to a similar extent as ECizl. F) Cterm274 retains no DNA replication activity
in this assay. G, H, I) Further deletion analysis in the N-terminal two thirds of the ECizl
protein shows that a short region 3' of exon 8 is required for Cizl function when assayed
in vitro.

Fig. 4 Characterisation of anti-Cizl polyclonal antibodies and identification of 125 kDa
Cizl-related bands. A) Coomassie stained SDS-polyacrylamide gel showing purified
recombinant ECizl fragment Nterm442, and western blots of recombinant Nterm442
using anti-Cdc6 antibody VI, and anti-Cizl antibodies 1793 and 1794. B) Western blot
of 3T3 whole cell extract. Of the two bands detected by anti-Cizl antibody 1793 one has
the same mobility as p85-Cizl (100 kDa) recognized by antibody VI and the other has
an apparent Mr of 125 kDa. Anti-Cizl antibody 1794 recognizes only the 125 kDa form of
Cizl (and a second antigen of around 80 kDa). C) Immuno-precipitation from 3T3
nuclear extract, using antibody VI or anti-Cizl 1793. Both antibodies precipitate p85,
which is recognized by the reciprocal antibody in western blots. p125 is precipitated by
antibody 1793, and to a lesser extent by antibody VI, and these are recognized by 1793
in western blots. Mcm3 is shown as a control.

Fig. 5 Immunofluorescence analysis of endogenous Cizl. Cizl resides in sub-nuclear
foci that overlap with sites of DNA replication. A) Endogenous Cizl (red) in 3T3 cells
fixed before (untreated) or after (detergent treated) exposure to Triton X-100, detected
with anti-Cizl antibody 1793. Nuclei are counterstained with Hoechst 33258 (blue).
Cdc6 (green), detected with a Cdc6-specific monoclonal antibody is shown for
comparison. B) Inclusion of recombinant Cizl blocks reactivity of antibody 1793 with
detergent treated nuclei. C) Detergent-resistant Cizl (red) is present in all nuclei in
cycling populations, while detergent resistant PCNA (green) persists only in S phase
nuclei. D) High magnification confocal sections of detergent resistant Cizl and PCNA,
and merged image showing co-localising foci (yellow). E) Line plots of red and green
fluorescence across the merged image in D, at the positions indicated (i and ii). F) Cross-
correlation plot (Rubbi and Milner, 2000; van Steensel et al., 1996) for green foci
compared to red over the whole merged image in D, and (inset) for the marked section
after thresholding fluorescence at the levels shown in Eii. The red line in the inset to F
shows loss of correlation when the Cizl image is rotated 90° with respect to PCNA. Bar
is 10 µm.

Fig. 6 RNA interference. Cizl depletion inhibits S phase. A) siRNAs that target Cizl
transcripts at four sites (see Fig. 2A) were individually applied to cycling 3T3 cells as a
single 3 nM dose and cell number was monitored at the indicated times. Images of cell
populations at 16 and 40 hours after transfection with siRNA 8 (red outline) or mock
treated cells (blue outline) are shown. B) Cizl protein detected with anti-Cizl 1793
(green) 48 hours after exposure to Cizl siRNAs (4 and 8), or control GAPDH siRNA. C)
Cizl, GAPDH and β-actin transcript levels in cells exposed to Cizl siRNAs (4 and 8), or
control GAPDH siRNA for 24 hours. Numbers in parentheses reflect band intensity in
arbitrary units, and the overall reduction in Cizl and GAPDH transcripts (normalised
against β-actin) is expressed as a percentage. D) The proportion of cells that
incorporated BrdU into DNA (green) is significantly decreased in Cizl depleted cells, 48
hours after treatment with Cizl siRNA. Histogram shows average results from four
independent experiments. E) The number of nuclei with detergent resistant Mcm3
(green) increases in populations treated with Cizl siRNA. F) The proportion of nuclei
with detergent resistant PCNA (green) also increases under these conditions. All nuclei
are counterstained and shown in pseudo-colour (red).
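
The normalisation described for panel C of Fig. 6 can be illustrated with a short calculation. The Python sketch below divides each transcript band intensity by the actin band intensity from the same lane, then expresses the change relative to a control lane as a percentage. The function names and all intensity values are hypothetical and are not taken from the figure.

```python
# Illustrative sketch only: names and numbers are assumptions, not data from Fig. 6.


def normalised(target_intensity, actin_intensity):
    """Band intensity of a transcript relative to the actin band in the same lane."""
    if actin_intensity <= 0:
        raise ValueError("actin intensity must be positive")
    return target_intensity / actin_intensity


def percent_reduction(treated, control):
    """Reduction of a normalised signal in treated cells relative to control cells."""
    return 100.0 * (1.0 - treated / control)


# Hypothetical band intensities in arbitrary units.
control_lane = {"target": 200.0, "actin": 100.0}
treated_lane = {"target": 60.0, "actin": 120.0}

control_ratio = normalised(control_lane["target"], control_lane["actin"])  # 2.0
treated_ratio = normalised(treated_lane["target"], treated_lane["actin"])  # 0.5
print(percent_reduction(treated_ratio, control_ratio))  # 75.0
```

Normalising within each lane before comparing lanes means that differences in loading do not register as changes in transcript level.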

Fig. 7 RT-PCR analysis of Cizl exons 3/4 splice variant expression in mouse primordial
germ cells and embryonic stem cells. Exons 3 and/or 4 are alternatively spliced in these
cell types, but not in neonatal heart. These data are consistent with the hypothesis that
full-length Cizl is the predominant form in neonatal somatic tissue, and that variants
occur with more frequency earlier in development, and in germ line tissues.

Fig. 8 Transient transfection of mouse 3T3 cells. A. GFP-tagged Cizl constructs were
transfected into NIH3T3 cells or B. microinjected into the male pro-nucleus of fertilized
mouse eggs at the one cell stage. By 24 hours Cizl and ECizl became localized to the
nucleus forming a subnuclear spotty pattern, while GFP alone was present in both the
nucleus and the cytoplasm. C. High magnification images of live 3T3 cell nuclei 24
hours after transfection showing the subnuclear organisation of EGFP tagged Cizl and
ECizl and derived fragments with the C-terminal fragment (equivalent to Cterm274)
removed. In the absence of C-terminal domains GFP-ECizl is diffusely localised in the
nucleus 24 hours after transfection, while GFP-Cizl aggregates to form one or two large
blobs within the nucleus. D. The C-terminal 274 domain alone is cytoplasmic until after
cells have passed through mitosis (most likely due to lack of nuclear localisation
sequences and passive entry to the nucleus), but once inside binds to nuclear structures
and condenses with chromosomes. E. Representative images of GFP-Cizl (green), BrdU
(red) and total nuclei (blue) in a population labelled with BrdU for the first 12 hours
after transfection are shown. Histograms show the proportion of transfected (green) cells
that incorporated BrdU compared to the number of untransfected (grey) cells for three
separate labelling windows. During 0-22 hours after transfection rapidly cycling cells
registered a consistent increase in the BrdU labelled fraction when transfected with
either Cizl or ECizl. Similar results were obtained with dense cultures in which most
cells had exited the cell cycle and entered quiescence. However, when rapidly cycling
cells were exposed to BrdU for a short (20 minute) pulse 22 hours after transfection the
number of cells engaged in DNA synthesis was reduced in the Cizl and ECizl
transfected populations, compared to untransfected controls and cells transfected with
GFP alone. This indicates that by 22 hours DNA synthesis had ceased in Cizl
expressing cells.

Fig. 9 Altered proliferation potential and cell morphology in transfected populations.
Cell clusters arising in transfected 3T3 cell populations. A. Cells were transfected with
the N-terminal two thirds of Cizl or ECizl (Nterm442) tagged with GFP, and
maintained under selection with 50 µg/ml G418. After three weeks under selection, cell
aggregates were visible with GFP positive cells within.

Fig. 10 Human Cizl splice variants in paediatric cancers. There are seven human Cizl 
cDNAs in public databases, but only one is derived from normal adult tissue (B cells)
and it contains all predicted exons. The other six are derived from embryonic cells or
paediatric cancers. Five of these are alternatively spliced with variability in exons 2, 3,
6, and 8 (like mouse ECizl), and also in exon 4 (like mouse ES cells, primordial germ
cells and testis). The sixth (AF159025) lacks the first methionine and contains
single-nucleotide polymorphisms that give rise to amino-acid substitutions. All differences
from the predicted sequence (AB030835) are marked. 

Fig. 11 EST sequence analysis. On each map a schematic representation of the Cizl 
protein is included for reference, showing the positions of alternatively spliced exons
(black), putative chromatin interaction domains (grey) and predicted zinc fingers (black
vertical lines). All EST sequences are accompanied by their Genbank accession number
with the library from which they were derived indicated in parentheses. Sequences absent
from Cizl ESTs due to alternative splicing are shown in yellow, frame-shifts in red and
putative deletions in grey. Single nucleotide polymorphisms that give rise to amino-acid
substitutions are indicated by black dots and some of these occur in a consensus cdk
phosphorylation site which we have shown to be important for the regulation of Cizl
activity (blue dots). Position of the inserted sequence in the carcinoma cell line MGC102
is indicated by a triangle.

A) Translated ESTs from paediatric cancers and adult neural cancers.
B) Translated ESTs from various non-cancer cells and tissues.
C) Translated ESTs from leukemias, lymphomas, and from normal haematopoietic
and lymphocytic cells.
D) Translated ESTs from carcinomas.
E) Translated ESTs from a range of other cancers.
F) Summary of alternatively spliced regions in human Cizl showing conditionally
included sequences.

Fig. 12 Cizl splice variant expression in Ewing's sarcoma family tumour (ESFT) cell
lines and neuroblastoma cell lines. A. Whole RNA samples from six independent
ESFT cell lines, two neuroblastomas and a control cell line (HEK293 cells) were subjected
to RT-PCR analysis using 4 different primer sets.

ESFT cell lines are 1) A673, 2) RDES, 3) SKES1, 4) SKNMC, 5) TC3, 6) TTC466.
Neuroblastoma cell lines are 1) IMR32, 2) SKNSH.

B. Analysis of Cizl Exons 3/4/5 PCR products in ESFTs and neuroblastoma. The
products of primers h3 and h4 (spanning potentially variable exons 4 and 6) were
analysed in more detail. PCR fragments were purified from agarose gels by standard
procedures, subcloned and sequenced to identify the source of fragment size variations.
Between one and eleven individual clones for each of the seven cell lines were
sequenced and the results are summarised in tabular form. Cizl from ESFT cell lines
lacks exon 4 in 31% of transcripts overall, and for some ESFT lines this is nearer 50%.
DSSSQ is more commonly absent in the two neuroblastoma cell lines tested here.

Fig. 13 Cizl isoforms in normal human fibroblasts (Wi38) and metastatic prostate 
cancer cell lines (PC3 and LNCaP). A. Both prostate cancer cell lines contain an excess
of the largest p125 Cizl protein variant in the nuclear fraction, compared to the
non-cancer cell line. B. Models for the production of p85 (p100) from p125 variants by
protein processing during initiation of DNA replication.

Fig. 14 illustrates the full length mouse mRNA sequence. 
Fig. 15 illustrates the full length human mRNA sequence. 
Fig. 16 illustrates the full length mouse protein sequence. 
Fig. 17 illustrates the full length human protein sequence. 

Fig. 18 illustrates human alternatively spliced protein sequences. Sequences shown are 
absent in the spliced protein sequences. 

Fig. 19 illustrates human alternatively spliced mRNA sequences. Sequences shown are
absent in the spliced protein sequences.

Fig. 20 A and B illustrate unique junction sequences created in human Cizl proteins by
missing exons. Junction sequences represent prime sites of target for therapeutic agents
identified by the method of the invention.

Fig. 21 A to H illustrate junction sequences created in human Cizl mRNA.

Identification of Cizl

We have exploited a polyclonal antibody (antibody VI) that was
raised against recombinant human Cdc6 (Coverley et al., 2000; Stoeber et al., 1998;
Williams et al., 1998) to identify and study an unknown antigen whose behaviour
correlates with initiation of DNA replication in vitro. The antigen has an apparent Mr of
100 kDa (called p85) and is readily detectable in extracts from 3T3 cells (Fig. 1A).

DNA synthesis can be activated in cell-free replication experiments using 'replication
competent' late G1 phase nuclei, G1 extracts, and recombinant cyclin A-cdk2. Under
these conditions nuclei will incorporate labelled nucleotides into nascent DNA, in a
manner strictly dependent on the concentration of active protein kinase (Fig. 1B). Above
and below the optimum concentration no initiation of DNA replication takes place.
However, other events occur which inversely correlate with initiation (Coverley et al.,
2002). Here we use activation of DNA synthesis (Fig. 1B), and Mcm2 phosphorylation
(which results in increased mobility, Fig. 1C), to calibrate the effects of recombinant
cyclin A-cdk2 in cell-free replication experiments, and correlate the behaviour of p85
with activation of DNA synthesis.

In G1 nuclei that are re-isolated from reactions containing initiation-inducing
concentrations of cyclin A-cdk2, p85 antigen is more prevalent compared to nuclei
exposed to lower or higher concentrations of kinase (Fig. 1D). This suggests that p85 is
regulated at some level by cyclin A-cdk2, in a manner that is co-incident with activation
of DNA synthesis. No other antigens correlate so closely with this stage in the cell-free
initiation process, therefore we used antibody VI to clone the gene for mouse p85.

When applied to a cDNA expression library derived from 11-day mouse embryos
antibody VI picked out two clones that survived multiple rounds of screening (see
methods). One encoded mouse Cdc6, while the other encoded 716 amino acids of the
murine homologue of human Cizl (Mitsui et al., 1999). Full-length human and mouse
Cizl have approximately 70% overall homology at the amino-acid level, with greatest
(>80%) homology in the N and C terminal regions. Cizl is conserved among vertebrates
as homologues exist in rat and fugu, but no proteins with a high degree of homology or
similar domain structure could be identified in lower eukaryotes, raising the possibility 
that Cizl evolved to perform a specialised role in vertebrate development. 

A previous publication on human Cizl (Mitsui et al., 1999) demonstrated interaction with
the cell-cycle protein p21-CIP1, leading to investigation of a proposed role as a
transcription factor, not a DNA replication factor. A second paper (Warder and Keherly,
2003) published after the priority date of this patent application suggests a role for Cizl
in tumorigenesis, but does not demonstrate a role in DNA replication or recognise the
importance of Cizl splice variant expression.

Multiple Cizl isoforms

The predicted mouse Cizl open reading frame and a cDNA
derived from a mouse mammary tumour library (BC018483) contain three regions that
are not present in our embryonic clone (AJ575057), hereafter referred to as ECizl (Fig.
2A). The three variable regions in ECizl appear to be the result of alternative splicing of
exons 2/3, 6 and 8 (Fig. 2B). Mouse melanoma clone AK089986 lacks two of the same
three regions as ECizl (Fig. 2A), while the third encodes an N-terminal polyglutamine
stretch that is also absent from human medulloblastoma derived clones. A fourth
sequence block derived from exons 3/4 is absent from Cizl transcripts derived from
mouse ES cells, and from exon 4 in mouse primordial germ cells (Fig. 7). Human Cizl is
also alternatively spliced at the RNA level to yield transcripts that exclude combinations
of the same four sequence blocks as mouse Cizl (see below). In fact, all known
variations in mouse Cizl cDNAs have close human parallels, some of which are
identical at the amino-acid level. This suggests that the different Cizl isoforms have
functional significance. A fifth variable region (not yet observed in the mouse) is
alternatively spliced in human Cizl transcripts derived mainly from carcinomas.

The data suggest that shorter forms of Cizl (lacking the alternatively spliced exons) are
most prevalent early in development and in cell lineages that give rise to the germ line.
In the analysis shown in Fig. 7, only Cizl from fully developed neonatal heart shows no
alternative splicing, while all embryonic cell types contain alternatively spliced forms.
Furthermore, the only complete Cizl cDNAs in public databases (human or mouse) are
derived from non-embryonic cell types, and the only ones derived from embryonic 
sources are alternatively spliced. Therefore, Cizl splice variant expression appears to 
occur preferentially in cell types that are not yet fully differentiated. 

Notably, Cizl cDNAs from paediatric cancers are also alternatively spliced (see below). 
This led us to the hypothesis that failure to express the appropriate Cizl isoform at the
right point in development leads to inappropriately regulated Cizl activity. This could 
contribute to unscheduled proliferation and cellular transformation. 

ECizl stimulates DNA replication in vitro

Upon exposure to cytosolic extract from S
phase cells, late G1 phase nuclei initiate DNA replication and begin synthesizing nascent
DNA (Krude et al., 1997). We used this cell-free assay to test the effect of ECizl, and
derived recombinant fragments, on DNA synthesis (Fig. 3). Full-length ECizl protein
consistently increased the number of nuclei that replicated in vitro, from 30% (+/- 0.9%)
to 46% (+/- 5.5%), which suggests that Cizl is limiting for initiation in S phase extracts
(Fig. 3A). Only two other classes of protein (cyclin-dependent kinases, Coverley et al.,
2002; Krude et al., 1997; Laman et al., 2001, and the Cdc6 protein, Coverley et al.,
2002; Stoeber et al., 1998) have been previously found to stimulate cell-free initiation.
Thus, ECizl is the first protein to have this property that was not already known to be
involved in the replication process. The positive effect of recombinant ECizl on cell-free
initiation argues that endogenous Cizl plays a positive role in DNA replication in
mammalian cells.
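
The replication figures quoted above can be restated as a short calculation. The Python sketch below subtracts the 17% of nuclei that were already in S phase when the preparation was made (Fig. 3A) from the replicating totals, giving the fraction of nuclei that initiated in vitro. The `percent_initiation` helper is an assumed form of the "% initiation" measure, since the exact formula is given in the methods, and the competent fraction passed to it is a hypothetical value chosen for illustration only.

```python
# Illustrative sketch only: function names and the competent fraction are assumptions.

S_PHASE_PCT = 17.0  # nuclei already in S phase when the preparation was made (Fig. 3A)


def initiated_in_vitro(replicating_pct, s_phase_pct=S_PHASE_PCT):
    """Percentage of all nuclei that began replication during the incubation."""
    return replicating_pct - s_phase_pct


def percent_initiation(replicating_pct, competent_pct, s_phase_pct=S_PHASE_PCT):
    """Initiating nuclei as a percentage of competent nuclei (assumed form)."""
    if competent_pct <= 0:
        raise ValueError("competent_pct must be positive")
    return 100.0 * initiated_in_vitro(replicating_pct, s_phase_pct) / competent_pct


without_ecizl = initiated_in_vitro(30.0)  # 13.0
with_ecizl = initiated_in_vitro(46.0)     # 29.0
print(without_ecizl, with_ecizl)

# Hypothetical competent fraction of 40% of nuclei, for illustration only.
print(percent_initiation(46.0, competent_pct=40.0))  # 72.5
```

On these figures the proportion of nuclei initiating in vitro rises from 13% to 29% of the population when ECizl is added.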



WO 2004/051269 PCT/GB2003/005334

Stimulation of cell-free initiation is concentration-dependent with peak activity in S 
phase extract at around 1 nM ECizl (Fig. 3B). This echoes previous cell-free analyses
with other recombinant proteins (Coverley et al., 2002; Krude et al., 1997), where 
stimulation of initiation typically peaks and then falls back to the un-stimulated level at 
high concentrations. For ECizl, the reason for the drop in activity at high concentrations 
is not yet clear. However, mutagenesis studies (see below) suggest that the restraining 
mechanism is likely to be active and specific rather than due to a general imbalance in 
the composition of higher order protein complexes. 

Down regulation of ECizl involves threonines 191/192

Cizl is likely to be a phospho-protein
in vivo since it contains numerous putative phosphorylation sites, and it displays
altered mobility when 3T3 cell extracts are treated with lambda phosphatase (not
shown). Murine Cizl contains two RXL cyclin binding motifs and five putative
cdk-phosphorylation sites, which are present in all known variants. Four of these are located
in the N-terminal fragment of ECizl that contains in vitro replication activity (see
below), and one is adjacent to the site at which exon 6 is alternatively spliced to exclude
a short DSSSQ sequence motif (Fig. 2A, C). As this motif is 100% identical and
alternatively spliced in both mouse and man we reasoned that conditional inclusion
might serve to regulate Cizl activity, identifying this region of the protein as potentially
important. We therefore chose to focus on the cdk site that is four residues upstream and
which is also conserved in mouse and man, by combining a genetic approach with
cell-free replication assays. Starting with ECizl, two threonines at 191 and 192 were
changed to two alanines, generating ECizlT(191/2)A (Fig. 2D). When tested in vitro for
DNA replication activity, ECizlT(191/2)A stimulated initiation in late G1 nuclei to a
similar extent as ECizl (Fig. 3C). However, unlike ECizl, stimulation of initiation was
maintained over a broad range of concentrations that extended over at least three orders
of magnitude. Therefore, a mechanism to restrict the activity of excess ECizl exists and
operates in a cell-free environment. In a separate construct, the threonine at position 293
was also changed to alanine generating ECizlT(293)A (Fig. 2D), but this alteration had
little effect on ECizl activity assayed in vitro (Fig. 3D).

These results demonstrate that down-regulation of ECizl activity involves threonine
191/2, and is probably caused by cyclin-dependent kinase mediated phosphorylation at 
this site. This links Cizl activity to the cdk-dependent pathways that control all major 
cell-cycle events, including initiation of DNA replication. 

Most pre-replication complex proteins and many replication fork proteins are
phosphorylated in vivo, often by cyclin-dependent kinases (Bell and Dutta, 2002; Fujita,
1999). Our data suggest that nuclear accumulation of p85-Cizl antigen is regulated
(directly or indirectly) by cyclin A-cdk2, and they show that a specific consensus cdk
phosphorylation site at threonine 191/192 is involved in controlling Cizl activity. When
this site is made unphosphorylatable Cizl activity is maintained over a broader range of
concentrations in cell-free assays. Therefore, Cizl activity is normally down regulated
by modification at this site. The functions of the other conserved cdk phosphorylation
sites, and the effect of conditional inclusion of an RXL cyclin-binding motif in the
alternatively spliced N-terminal portion of Cizl, remain to be determined. Thus, the
simple negative relationship between Cizl activity and cdk-dependent phosphorylation
that has been uncovered here is unlikely to be the whole story. However, our analysis so
far links Cizl with the cdk-dependent pathways that control all major cell-cycle
transitions, and is therefore consistent with our main conclusion that Cizl is involved in
initiation of DNA replication.

In vitro replication activity resides in the N-terminus

Cizl possesses several C-terminal
features that may anchor the protein within the nucleus. The matrin 3 domain suggests
interaction with the nuclear matrix and the three zinc-fingers imply interaction with
nucleic acids. Indeed, recent evidence suggests that human Cizl binds DNA in a weakly
sequence specific manner (Warder and Keherly, 2003). To determine whether C-terminal
domains are important for ECizl replication activity we divided the protein into
two fragments (Fig. 2D). Nterm442 (which contains the NLS, two conserved cdk sites,
one zinc finger and all known sites where variable splicing has been observed)
stimulates initiation to a similar extent and at the same concentration as ECizl (Fig. 3E).
In contrast, the C-terminal portion (Cterm274) contains no residual replication activity
(Fig. 3F). Therefore, the matrin 3 domain, one of the cyclin-dependent kinase
phosphorylation sites and two of the zinc-fingers are not required for the DNA
replication activity of ECizl, when assayed in vitro. It should be noted however that this
analysis measures ECizl activity in trans under conditions where the consequences of
mis-localisation are unlikely to be detected. Therefore, it remains possible that the
matrin 3 domain and zinc fingers act in vivo to direct Cizl activity to specific sites in the
nucleus and thus limit the scope of Cizl activity.

Endogenous Cizl

Antibody VI recognises Cdc6 as well as p85-Cizl (Fig. 1A), so it is
not suitable for immuno-fluorescence experiments aimed at visualizing the sub-cellular
localization of endogenous Cizl. We therefore generated two new rabbit polyclonal
anti-sera against recombinant ECizl fragment Nterm442, designated anti-Cizl 1793 and
1794. As expected, purified Nterm442 is recognised by anti-Cizl antibodies 1793 and
1794 in western blots, but it is also recognised by antibody VI (Fig. 4A), supporting the
conclusion that p85 (p100) is indeed Cizl.

When applied to protein extracts derived from growing 3T3 cells anti-Cizl 1793 
recognised two antigens, with Mr of 125 and 100 kDa (Fig. 4B), whose relative 
proportions vary from preparation to preparation. The 100 kDa band co-migrates with 
the cyclin-A responsive antigen that is recognized by antibody VI (Fig. 1 and 4B), 
which suggests that both antibodies recognise the same protein in vivo. We confirmed 
that the p100-Cizl bands recognised by antibody VI and 1793 are the same protein by
immuno-precipitation (Fig. 4C). Antibody VI precipitated a 100 kDa band that was
recognised in western blots by 1793, and vice versa. Furthermore, in the same
experiment 1793, and to a lesser extent antibody VI, precipitated a 125 kDa antigen, that 
was recognised in western blots by 1793. Taken together our observations show that the 
100 kDa band is indeed Cizl (previously known as p85), and they suggest that Cizl 
protein exists in at least two forms in cycling cells.

In addition to the immuno-precipitation evidence described above, several other
observations lead to the conclusion that p125 is also a form of Cizl. First, both of our
anti-Cizl antibodies (1793 and 1794) have this band in common. Both antibodies
produce the same pattern of nuclear staining in immuno-fluorescence experiments, and
this is disrupted in cells treated with Cizl siRNA (see below). Second, the relative
proportions of p100 and p125 vary from preparation to preparation, and could therefore
be the result of proteolytic cleavage. Third, our results are strikingly similar to those of
Mitsui et al. (1999) whose anti-human Cizl monoclonal antibody detected two antigens
with apparent Mr of 120 and 95 kDa in HEK293 cells. They proposed that the 120 kDa
form of human Cizl protein is processed to produce the 95 kDa form and our results are
consistent with this proposal.

The 125 kDa band recognized by antibody 1793 in mouse and human cells resolves into
three Cizl-related bands during high-resolution electrophoresis of material derived from
non-transformed human cells (Wi38, see later), and mouse cells (NIH3T3, not shown).
This may be the result of post-translational modification of the Cizl protein or of
alternative splicing of the Cizl transcript.

Sub-cellular distribution of Cizl

Anti-Cizl 1793 was used to visualise the sub-cellular
distribution of Cizl protein (p85 and p125) in 3T3 cells (Fig. 5A), and in HeLa cells (not
shown). In both cell types 1793 reacted with a nuclear-specific antigen, and this was
blocked by inclusion of recombinant Nterm442 fragment (Fig. 5B). Unlike Cdc6, which
is shown for comparison (Fig. 5A), Cizl is clearly detectable in all 3T3 cells in this
cycling population. Therefore Cizl is present in the nucleus throughout interphase,
although minor variations in quantity or isoform would not be detected by this method.
After detergent treatment overall nuclear Cizl staining was reduced in all nuclei, which 
suggests that Cizl is present in the nucleus as both a soluble fraction and also bound to 
insoluble nuclear structures. 

When soluble protein is washed away, the insoluble, immobilised antigen resolves into a 
punctate sub-nuclear speckled pattern at high magnification (Fig. 5C, D). Cizl speckles 
show a similar size range and distribution as replication 'foci' or 'factories', the sites at
which DNA synthesis takes place in S phase. To ask whether Cizl is coincident with 
sites of replication factories, we compared the position of Cizl speckles to the position 
of PCNA, a component of replication complexes in S phase cells (Fig. 5C). In confocal 
section, PCNA foci are less abundant than Cizl foci, but they are almost all co-incident 
with Cizl (Fig. 5D, E, F). This is particularly striking for foci in the medium size range. 
In merged images, overlap between the positions of PCNA and Cizl foci results in 
yellow spots, while the remaining Cizl foci that are not co-incident with PCNA are red. 
Green (PCNA alone) foci are virtually absent, which suggests that Cizl is present at all 
sites where DNA replication factories have formed. 
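
The colour classification used for the merged images can be illustrated with a minimal sketch. It assumes that foci have already been called in each channel and reduced to pixel positions; the function name, the input format and the example positions are hypothetical and are not the co-localisation procedure cited in Materials and Methods.

```python
def classify_foci(pcna_foci, cizl_foci):
    """Count foci by channel overlap.

    Each argument is an iterable of (x, y) positions at which a focus was
    called in that channel (hypothetical input format).
    """
    pcna, cizl = set(pcna_foci), set(cizl_foci)
    both = pcna & cizl
    return {
        "yellow": len(both),        # PCNA and Cizl coincide
        "red": len(cizl - pcna),    # Cizl only
        "green": len(pcna - cizl),  # PCNA only
        # Fraction of PCNA foci that also carry Cizl signal
        "pcna_with_cizl": len(both) / len(pcna) if pcna else 0.0,
    }


# Made-up positions, for illustration only.
example = classify_foci(
    pcna_foci=[(1, 1), (2, 3)],
    cizl_foci=[(1, 1), (2, 3), (5, 5), (7, 2)],
)
print(example)
```

A population in which green foci are virtually absent corresponds to a "pcna_with_cizl" value close to 1, while a large "red" count reflects Cizl foci that lack PCNA.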

Cizl is also present at sites that do not contain PCNA (Fig. 5D), and unlike PCNA, Cizl 
foci persist throughout interphase (Fig. 5A). One interpretation of these observations is 
that Cizl marks the positions in the nucleus at which PCNA-containing replication 
factories are able to form in S phase, but that not all of these sites are used at the same 
time. It remains to be determined whether different Cizl foci become active sites of 
DNA replication at different times in S phase, or whether other nuclear activities also 
occur at sites where Cizl is bound. Indeed, at this stage it also remains possible that the 
100 kDa form and the 125 kDa variants of Cizl have different activities, and that they 
reside at nuclear sites with different functions. 

Cizl is essential for cell proliferation. So far we have shown that the behaviour of p85 
(p100)-Cizl correlates with initiation of DNA replication in cell-free assays, that 
recombinant Cizl stimulates the frequency of initiation, and that Cizl resides at the 
same nuclear sites as the DNA replication machinery. However, these data do not show 
that Cizl has an essential function in proliferating cells. In order to test this we used 
RNA interference (RNAi) to selectively reduce Cizl transcript levels in NIH3T3 cells. 
Four target sequences within Cizl were chosen (see Fig. 2A) and short interfering (si) 
RNA molecules were produced in vitro. When applied to cells, all four Cizl siRNAs 
restricted growth (Fig. 6A) and caused a visible reduction in the level of Cizl protein 
after 48 hours (Fig. 6B). The effect of Cizl depletion on proliferation becomes apparent 
between 23 and 40 hours post-transfection, which suggests that the first cell cycle 
without Cizl RNA is relatively unaffected. By 40 hours, controls and Cizl siRNA 
treated cells diverged significantly with no further proliferation in the Cizl depleted 
population. To verify the specificity of Cizl depletion, transcript levels were monitored 
at 24 hours, before proliferation is significantly inhibited (Fig. 6C). At this point Cizl 
transcripts were reduced to 42% of the level in control cells treated with GAPDH 
siRNA. These experiments show that Cizl is required for cell proliferation and are 
consistent with a primary function in DNA replication. 

To test this further, cells were pulse-labelled with BrdU 48 hours after siRNA treatment 
to determine the fraction of cells engaged in DNA synthesis (Fig. 6D). When Cizl levels 
were reduced the BrdU labelled fraction was also reduced, suggesting that DNA 
synthesis is inhibited under these conditions. Furthermore, cells in the Cizl depleted 
population that did incorporate BrdU (approximately 15% of the population) were less 
intensely labelled. Therefore, in some Cizl siRNA treated cells S phase is slowed down 
rather than inhibited completely, possibly due to incomplete depletion. 

Inhibition of DNA synthesis by Cizl siRNAs could be a secondary consequence of a 
general disruption of nuclear function. Therefore, we looked in more detail at a range of 
other replication proteins whose levels are regulated in a cell cycle dependent manner, to 
ask whether depleted cells arrest randomly, or accumulate at a particular point. 

During initiation of eukaryotic DNA replication Mcm complex proteins assemble at 
replication origins in late G1, in a Cdc6-dependent manner. Sometime later, DNA 
polymerases and their accessory factors (including PCNA) become bound to chromatin 
and origins are activated. This is associated with nuclear export and proteolysis of the 
majority of Cdc6 and, as DNA synthesis proceeds, gradual displacement of the Mcm 
complex from chromatin (Bell and Dutta, 2002). In order to identify the point of action 
of Cizl we used immuno-fluorescence to monitor Mcm3 and PCNA. In Cizl depleted 
cells (Fig. 6E, F) both proteins were detectable within the nucleus bound to detergent 
resistant nuclear structures. Therefore, these factors are unlikely to bind directly to Cizl, 
or to be dependent upon Cizl for their assembly. In fact, in four independent 
experiments the average number of cells with detergent-resistant chromatin-bound 
Mcm3 actually increased from 31% (+/-6%) to 51% (+/-5%) (Fig. 6E). Increased Mcm3 
indicates that the Cizl dependent step occurs after pre-replication complex assembly 
(but before completion of S phase). In the same cell populations the PCNA positive 
fraction also increased, from 32% (+/-5%) to 49% (+/-6%) (Fig. 6F), narrowing the 
point of Cizl action to after PCNA assembly. Thus, Cizl most likely acts to facilitate 
DNA replication during a late stage in the initiation process, while failure to act inhibits 
progression through S phase, leaving Mcm3 and PCNA in place. 

Taken together, our cell-free and cell-based investigations paint a consistent picture 
about the primary function of Cizl. They suggest that Cizl is a novel component of 
DNA replication factories, and they show that Cizl plays a positive role in the 
mammalian cell-cycle, acting to promote initiation of DNA replication. 

Three of our lines of investigation suggest that Cizl is required during a late stage in the 
initiation process after pre-replication complex formation. First, p85 (p100)-Cizl antigen 
accumulates in nuclei exposed to cyclin A-cdk2 concentrations that activate DNA 
synthesis, implying that Cizl functions during this step rather than during earlier 
replication complex assembly steps (Coverley et al., 2002). Second, functional studies 
with late G1 nuclei show that recombinant ECizl increases the number of nuclei that 
incorporate labeled nucleotides in vitro. Therefore, Cizl must be active in a step that 
converts nuclei that are poised to begin DNA synthesis into ones that are actively 
synthesizing DNA. Third, RNA interference studies point to a Cizl-dependent step after 
Mcm complex formation and after PCNA has become assembled onto DNA, but before 
these proteins are displaced. These distinct lines of investigation lead to strikingly 
similar conclusions about the point of action of Cizl, placing it in the later stages of 
initiation. 

Anti-Cizl siRNA as a therapeutic strategy. Our analysis shows that Cizl is essential for 
cell proliferation, and that targeting Cizl is a viable strategy to restrain proliferation. The 
alternatively spliced forms of Cizl that we observe in various cancers (see below) mean 
that Cizl could be targeted in a selective way to restrain proliferation in a subset of cells 
within a population. 

By way of example, this could be done by targeting siRNAs to the junction sequence 
created in Cizl transcripts when the C-terminal sequence 
GTTGAGGAGGAACTCTGCAAGCAG is missing, in small cell lung carcinoma cells, 
or by using Cizl protein lacking the corresponding VEEELCKQ sequence to select 
specific chemical inhibitors. 
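
As an illustration only, the relationship between the 24-nucleotide block and the VEEELCKQ peptide, and the idea of a junction-spanning target, can be sketched as follows. The standard genetic code is used; the flanking sequences and the function names are hypothetical placeholders and are not taken from the Cizl transcript.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def junction_window(upstream, downstream, flank=10):
    """Sequence spanning the junction formed when a block between
    `upstream` and `downstream` is absent from the transcript."""
    return upstream[-flank:] + downstream[:flank]


VARIABLE_BLOCK = "GTTGAGGAGGAACTCTGCAAGCAG"  # sequence given in the text
print(translate(VARIABLE_BLOCK))

# Placeholder flanks (NOT Cizl sequence), to show the shape of the result.
upstream = "ACGTACGTACGGATCCAAGT"
downstream = "TTGACCGGTAACGTTAGCAA"
print(junction_window(upstream, downstream))
```

A target designed across such a junction is present only in transcripts that lack the block, which is the basis of the selectivity proposed above.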



Accordingly the present invention also provides for the use of junction sequences 
created in Cizl transcripts and proteins when alternatively spliced sequences are not 
present, as a diagnostic marker, prognostic indicator or therapeutic target. 

Embryonic form Cizl is localized to the nucleus. RT-PCR analysis across potentially 
variable exons suggests that 3T3 cells predominantly express full-length Cizl, so our 
immuno-localization work on endogenous Cizl (Fig. 5) does not necessarily reflect the 
behavior of ECizl, which lacks several sequence blocks and possibly therefore 
information that is used to localize the protein. To directly compare the localization of 
ECizl and full-length Cizl, enhanced GFP tagged constructs were transfected into 3T3 
cells (fig. 8A), and microinjected into mouse pro-nuclei (fig. 8B). In all cases tagged 
Cizl and ECizl were exclusively nuclear, while a control construct expressing GFP 
alone was present in the nucleus and the cytoplasm. GFP-Cizl and GFP-ECizl were 
both visible in live cells as sub-nuclear foci, similar to replication foci seen in fixed cells 
by immuno-fluorescence. Thus, the three sequence blocks that are absent from ECizl do 
not appear to contribute to the nuclear localization of Cizl. 

Over the three day period following transfection no cell division was observed in the 
GFP-Cizl and GFP-ECizl transfected cells. These data suggest that overexpression of 
functional Cizl has an inhibitory effect on the cell cycle (in cells that have their 
regulatory pathways intact). 

Coalescence. When GFP-tagged constructs in which the C-terminal one third of Cizl had 
been removed were transfected into 3T3 cells, differences between ECizl and full length 
Cizl were observed (fig. 8C). By 48 hours FL Cizl N-term (442 equivalent) had 
coalesced into large intra-nuclear blobs, which only became apparent in the ECizl 
N-term442 transfected population by day 3 or later. Before this time ECizl N-term442 was 
localised in a nuclear specific but diffuse pattern. Thus the ability to coalesce is quantifiably 
different between Cizl and ECizl, and is therefore affected by one of the three 
alternatively spliced exons (2/3, 6 or 8). 


Like cells transfected with full length Cizl and ECizl, cells transfected with constructs 
in which the C terminal one third was removed were not seen to multiply during the 
three day monitoring period. 

C-terminal domains anchor Cizl to nuclear structures. As described above, the 
difference between Cizl and ECizl N-term is masked when C-terminal domains are also 
present (fig. 8A). Furthermore the C-terminal fragment alone directs GFP tag to 
chromatin, forming an irregular pattern that is not as spotty (focal) as Cizl or ECizl, but 
which remains attached to chromosomes during mitosis (fig. 8D). This suggests that 
C-terminal domains are involved in immobilizing Cizl on a structural framework in the 
nucleus. Notably, cells transiently transfected with C-terminal fragment continued to 
divide, resulting in gradual dilution of green fluorescence. 

Ectopic Cizl promotes premature entry to S phase. We looked at events occurring during 
the first day after transfection. The S phase fraction in transfected cells (green) was 
compared to the S phase fraction in untransfected cells, by labelling with BrdU at 
various intervals. During long labelling windows including 0-22 hours (fig. 8E), 0-12 
hours and 0-7 hours (not shown), consistently more of the Cizl and ECizl transfected 
cells were engaged in DNA synthesis, compared to untransfected cells. This suggests 
that Cizl and ECizl have a positive effect on the G1-S transition, promoting 
unscheduled entry to S phase. Similar results were obtained with 3T3 cell populations 
that were densely plated before transfection. This was done in order to minimise the 
fraction in the untransfected population that was engaged in S phase as part of the 
normal cell cycle. Under these conditions the difference between the transfected and 
untransfected population was maximised, clearly demonstrating the effect of ectopic 
Cizl on initiation of DNA replication. 

Conversely, when cells were labelled with BrdU during a short pulse administered at 22 
hours (fig. 8E), or at 10 hours or 12 hours post-transfection (not shown), the labelled 
fraction was consistently reduced in the Cizl and ECizl transfected populations. This 
suggests that the S phase that is induced by ectopic Cizl or ECizl is abnormal, with 
slow or aborted DNA synthesis that is not sufficient to label cells during short windows 
of exposure to BrdU. 

Therefore, ectopic Cizl and ECizl have two effects on S phase in cultured cells. They 
promote DNA replication, but this results in slow or aborted DNA synthesis. 

Clones with altered proliferation potential. We also monitored transfected populations 
of 3T3 cells over a three week time period. In cells transfected with the GFP-Nterm442 
or the non-alternatively spliced equivalent and maintained under selection with G418, 
large foci containing hundreds of cells were observed (fig. 9A). These clusters contained 
large numbers of GFP expressing cells, demonstrating that over-expression of the 
N-terminal portion of ECizl (in which replication activity resides) is not lethal, and 
suggesting that over-expression leads to altered proliferation phenotype, compared to 
untransfected cells, including loss of contact inhibition and failure to form a monolayer. 
This Cizl-dependent altered behaviour could contribute to tumour formation. A similar 
truncated version of mouse Cizl, lacking putative chromatin interaction domains, was 
previously isolated from a mouse melanoma (fig. 2). 

Human Cizl and cancer 

Cizl cDNAs in public databases. As mentioned above, human Cizl is alternatively 
spliced at the RNA level to yield transcripts that lack three of the same exons as mouse 
embryonic Cizl. Seven human Cizl cDNAs have been recorded in public databases (fig. 
10), submitted by Mitsui et al (1999), Warder and Keherly (2003) and large-scale 
genome analysis projects (NIH-MGC project, NEDO human cDNA sequencing project). 
Only one is derived from normal adult tissue, and this contains all predicted exons 
(AB030835). The rest are derived from embryonic cells (AK027287), or notably from 
four different types of paediatric cancer (medulloblastoma, AF159025, AF0234161, 
retinoblastoma, AK023978, neuroblastoma, BC004119 and Burkitt lymphoma, 
BC021163). The embryonic form and the cancer derived forms lack sequence blocks 
from the same three regions as our embryonic mouse clone, and from a fourth region 
which corresponds to exon 4. Therefore, the limited data suggest that alternatively 
spliced forms are more prevalent early in development. This correlation has not 
previously been noted in the scientific literature. The presence of alternatively spliced 
Cizl in paediatric cancers raises the possibility that Cizl mis-splicing might be linked to 
inappropriate cell proliferation. 

For example, one of the variable exons encodes a short conserved DSSSQ sequence 
motif that is absent in mouse ECizl and in a human medulloblastoma. This is directly 
adjacent to the consensus cdk phosphorylation site that we have shown to be involved in 
regulation of ECizl function. Conditional inclusion of the DSSSQ sequence might make 
Cizl the subject of regulation by the ATM/ATR family of protein kinases, which 
phosphorylate proteins at SQ sequences, thereby restraining Cizl initiation function in 
response to DNA damage. 

Analysis of expressed sequence tags. The presence of alternatively spliced Cizl in 
paediatric cancers prompted a detailed analysis of Cizl ESTs. There are 567 expressed 
sequence tags (ESTs) included in NCBI unigene cluster Hs.23476 (human Cizl). These 
are derived from a wide range of normal and diseased tissues and cell lines. Sequences 
have been translated and mapped against the predicted full-length amino-acid sequence 
of human Cizl. Sequence alterations that give rise to amino-acid substitutions, deletions, 
frame-shifts and premature termination of translation have been recorded. 

Alternatively spliced Cizl variants were also seen in this EST data set and are recorded 
here. The four sequence blocks that we previously reported to be alternatively spliced in 
human and mouse Cizl (Fig. 2) were observed in the EST sequences, as well as a 
previously undetected variant that lacks the exon 14 derived sequence VEEELCKQ. All 
of these recurrently variant sequence blocks are bounded by appropriate splice sites. A 
sixth, variable sequence block was identified in one carcinoma derived library, caused by 
inclusion of 

GCCACCGACACCACGAAGAGATGTGTTTGCCCACGTTCCAGTGCAGGGGTGGAGCACAGCCCGGCTTGTTACAGATAT. 
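
The mapping of a translated EST against the reference protein can be illustrated with a minimal sketch that reports a single contiguous block present in the reference but absent from the variant. The function name and the toy peptides are hypothetical (only the DSSSQ motif comes from the text), and this is not the procedure that was used to build the EST maps.

```python
def find_deleted_block(reference, variant):
    """Return (start, block) if `variant` equals `reference` with exactly one
    contiguous block removed, otherwise None.

    `start` is the 0-based position of the block in `reference`. When the
    boundaries are ambiguous (repeated residues), the leftmost placement
    consistent with the shared prefix is reported.
    """
    if len(variant) >= len(reference):
        return None
    # Length of the shared prefix.
    p = 0
    while p < len(variant) and reference[p] == variant[p]:
        p += 1
    # Length of the shared suffix, not allowed to overlap the prefix.
    s = 0
    while s < len(variant) - p and reference[-1 - s] == variant[-1 - s]:
        s += 1
    if p + s != len(variant):
        return None  # differences are not explained by one deletion
    return p, reference[p:len(reference) - s]


# Toy peptides for illustration; only the DSSSQ motif is taken from the text.
reference = "MKTLPADSSSQTPNRKE"
variant = "MKTLPATPNRKE"
print(find_deleted_block(reference, variant))
```

Substitutions, frame-shifts and premature stops would return None here and need separate handling.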

ESTs are grouped according to the cell type from which they were derived with the 
primary divisions occurring between neoplastic cells of adult, childhood or embryonic 
origin. ESTs from normal tissue of embryonic or adult origin are included for 
comparison. EST-derived Cizl protein maps are shown in fig. 11A-E and the 
alternatively spliced exons summarized in fig. 11F. 

Three sequence blocks in the N-terminal end of human Cizl are absent in transcripts 
from medulloblastomas and neuroblastoma (fig. 11A), and occasionally absent from 
Cizl transcripts from other cancers. We also found similar alternative splicing in a third 
paediatric cancer, Ewing's sarcoma (see below). Paediatric cancer-associated 
alternatively spliced sequences are from exons 2/3 (at least two versions), exon 4 and 
exon 6. 

Exon 8 variants in which one or more copies of a Q-rich degenerate repeat are absent 
have been noted in transcripts derived from normal cells (of embryonic or adult neural 
origin) and from various cancers. Alternative splicing in this region could produce Cizl 
with inappropriate activity, therefore exon 8 variant expression, or occurrence of point 
mutations which influence splicing in this region, might be useful as diagnostic or 
prognostic markers in cancer. The alternatively spliced degenerate repeats in exon 8 are 
detailed below and summarised in fig. 11F. 

In the C-terminal half of the human Cizl protein two sequence blocks are variably 
spliced. One of these is missing from transcripts derived from three out of five lung 
carcinoma and lung carcinoid libraries, and from three other carcinoma libraries (but 
very rarely from transcripts from other cell types). 

The second variant sequence block is due to inappropriate inclusion of extra sequence in 
transcripts from the epidermoid carcinoma library (MGC102). 

These sequences and the junction sequences formed in Cizl proteins, and Cizl 
transcripts when these segments are excluded or included, are potential targets for 
selective inhibition of cell proliferation in a wide range of different cancers. The 
remaining non-variant sequences are potential targets for non-selective inhibition of cell 
proliferation. 

In addition to splicing variations, other non-typical Cizl transcripts were found to 
preferentially occur in some cancers. In Rhabdomyosarcomas Cizl is prematurely 
terminated leading to a predicted protein that lacks C-terminal nuclear binding domains. 
This could lead to inappropriate DNA replication and might therefore be a therapeutic 
target or marker in this type of cancer. 

Several transcripts contain point mutations that lead to amino-acid substitutions in 
putative cyclin-dependent kinase (cdk) phosphorylation sites. In the cervical carcinoma 
library MGC12, this occurs twice. We have shown that two cdk phosphorylation sites 
are involved in restraining Cizl activity (fig. 3C and D), implicating these mutations in 
the deregulation of proliferation in cancer cells. One of these is the same as the 
carcinoma-derived mutant mentioned above (fig. 11E). Cancer-derived transcripts with 
point mutations in Cizl could also be targeted by RNA interference, or have value as 
diagnostic or prognostic indicators. 


Investigation of Cizl variant expression in paediatric cancers 

Cizl variant expression was investigated in six Ewing's sarcoma family tumour cell lines 
(ESFTs) and two neuroblastoma cell lines, using RT-PCR with primer sets that span 
three regions of known Cizl variability (fig. 12A). This analysis showed that the pattern 
of Cizl variant expression is different in ESFT cells compared to neuroblastoma cells 
compared to non-transformed cells, but apparently very similar within sets of cell lines 
from the same tumour. Therefore, Cizl variant expression could have prognostic or 
diagnostic potential for these cancers. Minor variations within a set of lines from the 
same tumour type could have prognostic value. 


By subcloning and sequencing amplified transcripts we found that all six ESFT lines 
tested express an exon 4 minus form of Cizl. As Cizl is essential for cell proliferation 
(see below), this offers a possible route for selective restraint of ESFT cells. Transcripts 
from the two neuroblastoma cell lines tested rarely lack exon 4 but frequently lack 
the DSSSQ motif encoded by exon 6 (fig. 12B). 

This experimental analysis confirms that paediatric cancers express forms of Cizl with 
variable inclusion of exons 4, 6 and probably exons 2/3. 

Two versions of the sequence encompassing exon 8 and one form of the sequence 
encompassing the VEEELCKQ-coding sequence were detected in ESFTs, 
neuroblastomas and controls, suggesting that these regions do not contribute to 
deregulation of Cizl in these paediatric cancers. 

In all cases, Cizl RT-PCR products were most abundant in reactions carried out with 
RNA samples from cancer cell lines, compared to controls (Wi38, HEK293, NIH3T3 
cells, and primary human osteoblasts). This is consistent with increased expression of 
Cizl variants in tumours. 

Analysis of Cizl protein expression in prostate cancer cell lines 

Normal, non-transformed human lung fibroblasts (and mouse NIH3T3 cells) express two 
major forms of Cizl that are detected by anti-Cizl polyclonal antibody 1793 in western 
blots (fig. 13A). The larger (approximately 125 kDa) band resolves into three distinct 
bands that are present in equal proportions in Wi38 cells, but grossly uneven proportions 
in prostate cancer cell lines PC3 and LNCAP (and ESFT cell lines, not shown). We 
postulate that these protein isoforms are generated by expression of variably spliced 
exons. Both tumour cell lines also contain more Cizl antigen than Wi38 cells, consistent 
with over-expression of Cizl in these cancer cell lines. 

Taken together our results (experimental and bioinformatics analysis of genome data) 
support the conclusion that Cizl is mis-regulated in a wide range of human cancers. We 
have shown that the Cizl protein plays a positive role in the DNA replication process, 
therefore mutant Cizl could contribute to cellular transformation, rather than be a 
consequence of it. If deregulation of Cizl is a common step in this process it represents a 
very attractive target for development of therapeutic agents. 

We have also associated particular changes with specific cancers, making it a real 
possibility that Cizl could be useful as a diagnostic or prognostic marker. 
These include: 

• Alternative splicing in the N-terminal part of the protein (that contains 
replication activity in vitro) in paediatric cancers. 

• Point mutations in cyclin-dependent kinase phosphorylation sites known to be 
involved in restraining Cizl replication activity. 

• Non-typical expression and nuclear binding properties of Cizl-pl25 forms in 
prostate carcinoma cell lines, possibly due to mis-regulated splicing of the 
degenerate repeats in exon 8, or other exons. 

• Conditional exclusion of a discrete motif (VEEELCKQ) in the C-terminal end of 
Cizl (probably involved in localization of Cizl protein within the nucleus) in 
small cell carcinoma of the lung and other carcinomas. 

• Increased levels of Cizl protein and RNA (detected by Western blot and by 
RT-PCR) in all cancer derived cell lines tested so far, compared to Wi38 normal 
embryonic lung fibroblast, human osteoblast RNA and mouse NIH3T3 
fibroblasts. 

The sequences shown in figures 14 to 21 are of use for the development of therapeutic, 
diagnostic, or prognostic reagents. 

Materials and Methods 

Cloning. A lambda TriplEx 5'-stretch, full length enriched cDNA expression library 
derived from 11 day old mouse embryos (Clontech ML5015t) was used to infect E. 
coli XL1-Blue according to the recommended protocol (Clontech). Plaques were lifted 
onto 0.45 micron nitrocellulose filters pre-soaked in 10mM IPTG (Sigma). Affinity 
purified antibody VI was applied to approximately 3 x 10^6 plaques at 1/1000 dilution in 
PBS, 10% non-fat milk powder, 0.4% Tween20, after blocking for 30 minutes in the 
absence of antibody. After two hours filters were washed three times with the same 
buffer and reactive plaques were visualized with anti-rabbit secondary antibody 
conjugated to horse-radish peroxidase (Sigma), and enhanced chemi-luminescence 
(ECL, Amersham) according to standard procedures. 43 independent plaques were 
picked but only two strains of phage survived a further three rounds of screening. These 
were converted to pTriplEx by transforming into BM25.8 and sequenced. One codes for 
mouse Cdc6 (clone P) and the other (clone L) for an unknown mouse protein that is 
homologous to human Cizl. We refer to this as embryonic Cizl (ECizl) and it was 
submitted to EMBL under the accession number AJ575057. 

Bacterial expression. pGEX based bacterial expression constructs (Amersham) were used 
to produce ECizl proteins for in vitro analysis. pGEX-ECizl was generated by inserting 
a 2.3kb SmaI-XbaI (blunt ended) fragment from clone L into the SmaI site of pGEX-6P-3. 
pGEX-Nterm442 was generated by inserting the 1.35kb XmaI-XhoI fragment into 
XmaI-XhoI digested pGEX-6P-3, and pGEX-Cterm274 by inserting the 0.95kb XhoI 
fragment into XhoI digested pGEX-6P-3. pGEX-T(191/2)A was generated from 
pGEX-ECizl by site directed mutagenesis (Stratagene QuikChange) using primers 
AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA and 
TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT. pGEX-T(293)A was generated 
from pGEX-ECizl using primers AAGCAGACACAGGCCCCGGATCGGCTGCCT 
and AGGCAGCCGATCCGGGGCCTGTGTGTGCTT. Integrity and reading frame of 
all clones were sequence verified. 
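
Mutagenic primer pairs of this kind are normally designed as exact reverse complements of one another, which gives a simple way to check a transcribed pair. The sketch below is an illustration only; the function names are hypothetical, and the T(191/2)A sequences are entered with the second primer read as beginning TCTT.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(forward, reverse):
    """0-based positions at which `reverse` differs from the exact
    reverse complement of `forward`."""
    expected = reverse_complement(forward)
    if len(expected) != len(reverse):
        raise ValueError("primers differ in length")
    return [i for i, (e, r) in enumerate(zip(expected, reverse)) if e != r]


# T(191/2)A primer pair as listed in the text.
T191_2A_FWD = "AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA"
T191_2A_REV = "TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT"

print(mismatch_positions(T191_2A_FWD, T191_2A_REV))
```

An empty list indicates that the two primers are exact reverse complements; the same check can be applied to the T(293)A pair.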

Recombinant Cizl, Cizl fragments and point mutants were produced in BL21-pLysS 
(Stratagene) as glutathione S-transferase-tagged protein. This was purified from 
sonicated and cleared bacterial lysates by binding to glutathione sepharose 4B 
(Amersham). Recombinant protein was eluted by cleavage from the GST tag using 
PreScission protease (as recommended by the manufacturer, Amersham), into buffer 
(50mM Tris-HCl pH 7.0, 150mM NaCl, 1mM DTT). This yielded protein preparations 
between 0.2 and 2.0 mg/ml. For replication assays serial dilutions were made in 100mM 
Hepes pH 7.8, 1mM DTT, 50% glycerol so that not more than 1µl of protein solution 
was added to 10µl replication assays, yielding the concentrations shown. Consistent 
with previous observations (Mitsui et al., 1999; Warder and Keherly, 2003) recombinant 
Cizl, and derived fragment N-term442 migrated through SDS-PAGE with anomalously 
high molecular weight. Cyclin A-cdk2 was produced in bacteria as previously described 
(Coverley et al., 2002). 

Anti-Cizl antibodies. Rabbit polyclonal antibody VI (Coverley et al., 2000; Stoeber et 
al., 1998; Williams et al., 1998) was raised against an internal fragment of bacterially 
expressed human Cdc6 corresponding to amino-acids 145-360, and affinity purified by 
standard procedures (Harlow and Lane, 1988). This antibody reacts strongly with 
endogenous p100-Cizl and also with ECizl Nterm442 fragment. Alignment of 
Nterm442 with Cdc6 amino-acids 145-360 suggests that the shared epitope could be at 
294-298 or 304-312 in mouse Cizl. Recombinant Nterm442 was used to generate two 
Cizl-specific polyclonal anti-sera designated 1793 and 1794 (Abcam). 1793 has been 
used routinely in the experiments described here. Its specificity was verified by 
reciprocal immuno-precipitation and western blot analysis with antibody V, by inclusion 
of Nterm442 (25µg/ml in antibody buffer, 10mg/ml BSA, 0.02% SDS, 0.1% Triton 
X100 in PBS), which blocked reactivity with endogenous epitopes, and by siRNA-mediated 
depletion of Cizl that specifically reduced 1793 nuclear staining. 

Immunoprecipitation. Asynchronously growing 3T3 cells were washed in PBS, rinsed in 
extraction buffer (20mM Hepes pH7.8, 5mM potassium acetate, 0.5mM magnesium 
chloride) supplemented with EDTA-free protease inhibitor cocktail (Roche) and scrape 
harvested as for replication extracts. Cells were lysed with 0.1% Triton X100 and the 
detergent resistant pellet fraction extracted with 0.3M NaCl in extraction buffer. 5µl of 
1793 or 2µl of antibody V were used per 100µl of extract and incubated for 1 hour at 
4°C. Antigen-antibody complexes were extracted with 100µl of protein G-sepharose 
(Sigma) and beads were washed five times with 50mM Tris pH 7.8, 1mM EDTA, 0.1% 
NP40, 150mM NaCl. Complexes were boiled in loading buffer (100mM DTT, 2% SDS, 
60mM Tris pH6.8, 0.001% bromophenol blue) and resolved by 6.5% 
SDS-polyacrylamide gel electrophoresis. 

Immuno-fluorescence. Cells were grown on coverslips and fixed in 4% 
paraformaldehyde, with or without brief pre-exposure to 0.05% Triton X100 in PBS. 
Endogenous Cizl was detected with 1793 serum diluted 1/2000 in antibody buffer 
following standard procedures. Mcm3 was detected with monoclonal antibody sc9850 
(1/1000), Cdc6 with monoclonal sc9964 (1/100) and PCNA with monoclonal antibody 
PC10 (1/100, all Santa Cruz Biotechnology). Co-localisation analysis of dual stained 
fluorescent confocal images was carried out as described (Rubbi and Milner, 2000; van 
Steensel et al., 1996). 

Cell synchrony. Mouse 3T3 cells were synchronized by release from quiescence as 
previously described (Coverley et al., 2002). Nuclei prepared from cells harvested 17 
hours after release (referred to as 'late-G1') were used in all cell-free replication 
experiments described here. This yielded populations containing S phase nuclei, 
replication competent late G1 nuclei and unresponsive early G1/G0 nuclei, in varying 
proportions. Recipient, mid-G1 3T3 extracts were prepared at 15 hours (these typically 
contain approximately 5% S phase cells). The series of cell-free replication experiments 
described here required large amounts of standardized extract, therefore HeLa cells were 
used because they are easily synchronized in bulk. S phase HeLa extracts were prepared 
from cells released for two hours from two sequential thymidine-induced S phase 
blocks, as described (Krude et al., 1997). 

Cell-free DNA replication. DNA replication assays were performed as described 
(Coverley et al., 2002; Krude et al., 1997). Briefly, 10µl of mid G1 or S phase extract 
(supplemented with energy regenerating system, nucleotides and biotinylated dUTP), 
and 5x10^4 late G1 phase nuclei were incubated for 60 mins at 37°C. Reactions were 
supplemented with baculovirus lysate containing cyclin A-cdk2 (Fig. 1B and C), where 
0.1µl of lysate has the same specific activity as 1nM purified kinase (Coverley et al., 
2002). All recombinant proteins were serially diluted in 100mM Hepes pH 7.8, 1mM 
DTT, 50% glycerol, so that not more than 1µl was added to 10µl replication assays, 
generating the concentrations indicated. Reactions were stopped with 50µl of 0.5% 
Triton X100 and fixed by the addition of 50µl of 8% paraformaldehyde, for 5 minutes. 
After transfer to coverslips nuclei were stained with streptavidin-FITC (Amersham) and 
counterstained with Toto-3-iodide (Molecular Probes). The proportion of labelled nuclei 
was quantified by inspection at 1000X magnification, and all nuclei with fluorescent foci 
or intense uniform labelling were scored positive. Images of in vitro replicating nuclei 
were generated by confocal microscopy at 600X magnification, of samples 
counterstained with propidium iodide. For analysis of nuclear proteins, nuclei were 
re-isolated after 15 minutes exposure to initiating conditions, by diluting reactions two fold 
with cold PBS and gentle centrifugation. 

Data analysis and presentation Prior to use in initiation assays each preparation of 
synchronized G1 phase nuclei is tested so that the proportion of nuclei that are already in 
S phase is established ('%S'). To do this nuclei are incubated in an extract that is 
incapable of inducing initiation of DNA synthesis (from mid-G1 phase cells harvested 
15 hours after release from quiescence), but that will efficiently support elongation DNA 
synthesis from origins that were initiated in vivo. The elongating fraction of nuclei 
incorporates labeled nucleotides efficiently during in vitro initiation assays but is 
uninformative. Routinely this fraction is pre-established and subtracted from the raw 
data. Synchronized populations in which 20% or less are in S phase are used for 
initiation assays. 

When 3T3 cells are released from quiescence by the protocol used here no more than 
70% of the total population enters S phase (Coverley et al., 2002). However, the highest 
observed replication frequency in vitro is nearer 50%, usually obtained by incubation 
with ECiz1. For the G1 population of 3T3 nuclei used here 17% were in S phase (%S) 
and the maximum number that replicated in any assay in vitro was 51% (% replication). 
Therefore, 34% of this population is competent to initiate replication in vitro (%C). 
Thus, for each data point in Figs. 3B-F, % initiation = (% replication - %S)/%C x 100. 
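
As a worked illustration of the calculation above, the following minimal Python sketch reproduces the arithmetic with the values quoted in the text (17% of nuclei already in S phase, 51% maximum replication). The function and parameter names are illustrative assumptions and do not form part of the method as described.

```python
def percent_initiation(pct_replication, pct_s, pct_max_replication):
    """Convert a raw replication score into % initiation.

    pct_replication     -- % of nuclei labelled in the assay (raw score)
    pct_s               -- % of nuclei already in S phase (%S)
    pct_max_replication -- highest % replication seen in any assay
    """
    # %C: the fraction of the population competent to initiate in vitro
    pct_competent = pct_max_replication - pct_s
    if pct_competent <= 0:
        raise ValueError("no initiation-competent fraction (%C must be > 0)")
    return (pct_replication - pct_s) / pct_competent * 100.0


# Values quoted in the text: %S = 17, maximum replication = 51, so %C = 34
print(percent_initiation(34, 17, 51))
```

With these values a raw score of 34% replication corresponds to 50% initiation, and a raw score equal to %S corresponds to no initiation.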

RNA interference Endogenous Ciz1 was targeted in proliferating NIH3T3 cells using in 
vitro transcribed siRNAs (Ambion Silencer kit), directed against four regions of mouse 
Ciz1. Oligonucleotide sequences that were used to generate siRNAs are 
AAGCACAGTCACAGGAGCAGACCTGTCTC and 
AATCTGCTCCTGTGACTGTGCCCTGTCTC for siRNA 4, 
AATCTGTCACAAGTTCTACGACCTGTCTC and 
AATCGTAGAACTTGTGACAGACCTGTCTC for siRNA 8, 
AATCGCAAGGATTCTTCTTCTCCTGTCTC and 
AAAGAAGAAGAATCCTTGCGACCTGTCTC for siRNA 9, and 
AATCTGCAGCAGTTCTTTCCCCCTGTCTC and 
AAGGGAAAGAACTGCTGCAGACCTGTCTC for siRNA 11. Target sequences that 
are distributed throughout the Ciz1 transcript were chosen based on low secondary 
structure predictions and on location within exons that are consistently expressed in all 
known forms of Ciz1 (sequences 4, 8, 11), with the exception of one (siRNA 9) that is 
known to be alternatively spliced. Negative controls were untreated, mock treated 
(transfection reagents but no siRNA) and cells treated with GAPDH siRNA (Ambion). 
Cy3 labelled siRNAs (Ambion) were used to estimate transfection efficiency, which was 
found to be greater than 95%. RNA interference experiments were performed in 24 well 
format starting with 2x10^4 cells per well in 500 µl of medium (DMEM with glutamax 
supplemented with 4% FCS). siRNAs were added 12 hours after plating using 
oligofectamine reagent for delivery (Invitrogen). Unless stated otherwise, siRNAs were 
used in pairs (at 2nM total concentration in medium), as two doses with the second dose 
delivered in fresh medium 24 hours after the first. Results were assessed at 48 hours 
after first exposure, by counting cell number, S phase labelling, and immuno-staining. 
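
The eight template oligonucleotides listed above share a regular layout: each is 29 nt long, begins with AA and ends with the same 8 nt (CCTGTCTC), and within each pair the intervening 19 nt of the second oligonucleotide is the reverse complement of that of the first. The following Python sketch checks that layout as a transcription-error safeguard. It is a consistency check only, based on the sequences as printed; the helper names are illustrative assumptions and nothing here is taken from the kit documentation.

```python
TAIL = "CCTGTCTC"  # 8-nt 3' end shared by all eight oligonucleotides as printed

OLIGO_PAIRS = {
    "siRNA 4": ("AAGCACAGTCACAGGAGCAGACCTGTCTC", "AATCTGCTCCTGTGACTGTGCCCTGTCTC"),
    "siRNA 8": ("AATCTGTCACAAGTTCTACGACCTGTCTC", "AATCGTAGAACTTGTGACAGACCTGTCTC"),
    "siRNA 9": ("AATCGCAAGGATTCTTCTTCTCCTGTCTC", "AAAGAAGAAGAATCCTTGCGACCTGTCTC"),
    "siRNA 11": ("AATCTGCAGCAGTTCTTTCCCCCTGTCTC", "AAGGGAAAGAACTGCTGCAGACCTGTCTC"),
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def core(oligo):
    """Return the 19 nt lying between the leading AA and the shared tail."""
    if len(oligo) != 29 or not oligo.startswith("AA") or not oligo.endswith(TAIL):
        raise ValueError("unexpected oligonucleotide layout: " + oligo)
    return oligo[2:-len(TAIL)]


def check_pairs(pairs=OLIGO_PAIRS):
    """Map each siRNA name to True if its two cores are reverse complements."""
    return {
        name: core(second) == reverse_complement(core(first))
        for name, (first, second) in pairs.items()
    }


for name, ok in check_pairs().items():
    print(name, "consistent" if ok else "MISMATCH")
```

A mismatch reported by such a check would point to a transcription error in one member of the pair rather than to a property of the siRNA itself.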

Northern blots were performed on RNAs isolated from cells treated for 24 hours with a 
single dose of siRNA, in reactions that were scaled up 5 fold. RNA was prepared using 
Trizol Reagent (Invitrogen) and samples were electrophoresed through 1% agarose, 
transferred onto Hybond N+ nylon membrane (Amersham), and sequentially hybridised 
at 50°C with cDNA probes using NorthernMax kit reagents (Ambion), following 
manufacturer's instructions. The membrane was stripped between each hybridisation 
using 0.5% SDS solution at 90°C, allowed to cool slowly to room temperature. Probes 
were [32P]-dCTP labelled using Random Primers DNA labelling system (Gibco BRL), 
and used in the following order: i. A 1.35kb XmaI-XhoI fragment derived from ECiz1, 
ii. Human β-actin cDNA (Clontech) and iii. Mouse GAPDH cDNA (RNWAY 
laboratories). The membrane was washed twice in 2X SSC 0.2% SDS for 30-60 mins 
each, followed by one wash in 0.2X SSC 0.2% SDS for 30 mins, at 55-65°C, depending 
on probe used. Hybridisation signals were quantified using an Amersham Biosciences 
Typhoon 9410 variable mode imager, and Image Quant TL software (v2002). Band 
intensities are expressed in arbitrary units (in parentheses), and results for Ciz1 and 
GAPDH were normalised against those for β-actin, and expressed as a %. 
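
The normalisation described above can be sketched as follows. The text does not state the reference against which the percentage is expressed; this sketch assumes, purely for illustration, that it is the β-actin-normalised signal in a control lane. The function name and the intensity values in the example are hypothetical and are not data from the experiments described.

```python
def normalised_percent(target, actin, control_target, control_actin):
    """Express a band intensity as a % of control after beta-actin normalisation.

    All four arguments are band intensities in arbitrary units.
    ASSUMPTION: the percentage is taken relative to a control lane; the
    source text does not specify the reference.
    """
    if actin <= 0 or control_target <= 0 or control_actin <= 0:
        raise ValueError("intensities used as denominators must be positive")
    treated_ratio = target / actin                  # loading-corrected signal
    control_ratio = control_target / control_actin  # same correction, control lane
    return treated_ratio / control_ratio * 100.0


# Hypothetical intensities, for illustration only
print(normalised_percent(target=50, actin=100, control_target=100, control_actin=100))
```

Dividing by the β-actin signal from the same lane corrects for differences in loading before the treated and control lanes are compared.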

S phase labelling The fraction of nuclei undergoing DNA synthesis in vivo was 
monitored by supplementing culture medium with 20 µM bromodeoxyuridine (BrdU, 
Sigma) for 20 minutes. Incorporated BrdU was visualized after acid treatment with 
FITC-conjugated anti-BrdU monoclonal antibody (Alexis Biochemicals) according to 
manufacturer's instructions. Nuclei were counterstained with Hoechst 33258 and scored 
under high (1000X) magnification. 

Green fluorescent protein tagged Ciz1 

Full-length mouse Ciz1 cDNA was obtained from UK HGMP Resource Centre (MGC 
clone 27988) and the sequence fully verified. A 2.8kb SmaI-XbaI (blunt ended) full 
length Ciz1 fragment from this clone, and a 2.3kb SmaI-XbaI (blunt ended) ECiz1 
fragment from pTriplEx-clone L were ligated in frame with enhanced green fluorescent 
protein (EGFP) into the SmaI site of pEGFP-C3 (Clontech). pEGFP-C3 with no insert 
was used as a control. Constructs were transfected into NIH3T3 cells using TransIT-293 
(Mirus), following manufacturer's instructions, or microinjected into the male pro-nucleus 
of fertilized mouse eggs at the one cell stage. Growing 3T3 cells transfected with full 
length EGFP-Ciz1, or EGFP-ECiz1 were analysed by live cell fluorescent microscopy 
up to three days after transfection. DNA synthesis was monitored during the first 24 
hours after transfection, by including the nucleotide analogue BrdU in cell culture 
medium for various time periods as indicated in figure legends. As described above any 
cells undergoing DNA synthesis while exposed to BrdU stain with anti-BrdU 
monoclonal antibody generating red nuclei. 
Ciz1 transfected cells were also maintained under selection with 50 ng/ml G418, in 
standard culture medium (DMEM Glutamax plus 10% fetal calf serum) for up to a 
month, yielding cell populations with altered morphology. 

EST sequence analysis 

Individual expressed sequence tags (ESTs) mapping to NCBI unigene cluster Hs.23476 
(human Ciz1) were translated using Genejockey and the predicted amino-acid sequence 
compared to the predicted sequence for full length Ciz1, with the aim of identifying 
recurrent changes in cancer cells. In order to exclude errors that reflect poor quality 
DNA sequence, such as that which occurs at the end of long sequencing runs, only those 
changes positioned more than 8 amino-acids from the end of uninterrupted sequence are 
included in this analysis. Frame-shifts that are restored by a second alteration later in the 
read, and frame-shifts that are followed by a stop codon are only included if followed by 
uninterrupted sequence. Thus the majority of sequencing errors are excluded from this 
analysis. However, it is expected that many of the point mutations that remain (including 
frame-shifts and stops) reflect errors introduced during sequencing. Therefore, this 
analysis is aimed at uncovering trends, with weight being given to point mutations only 
if they appear more than once. 

Of 567 sequences that map to the Ciz1 unigene cluster we have analysed most (all 
paediatric cancers, prostate and lung carcinomas, leukemias and lymphomas and a wide 
range of non-diseased tissues). Some were not mapped because they are extremely short 
reads or yielded very short amino-acid sequences upon translation, and for a small 
number we detected no homology to the Ciz1 coding sequence. A small number of ESTs 
were excluded from the analysis because of multiple frameshifts that produced stretches 
of homology in all three frames, with no indication of the reading frame used in vivo. 
These were all from cancer derived material, usually adenocarcinomas. 
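
Two of the filtering rules described in this section can be sketched in Python: discarding changes that lie within 8 amino-acids of the end of uninterrupted sequence, and giving weight to point mutations only when they recur. This is a minimal sketch under stated assumptions. The "end" is taken to be the last residue of the uninterrupted stretch, positions are 1-based amino-acid indices, and the change labels in the example are hypothetical placeholders rather than results from the analysis.

```python
from collections import Counter

MIN_DISTANCE_FROM_END = 8  # changes must lie MORE than this many residues from the end


def keep_change(position, run_end):
    """Return True if a change is far enough from the end of uninterrupted sequence.

    position -- 1-based amino-acid index of the change
    run_end  -- 1-based index of the last residue of the uninterrupted stretch
                (ASSUMPTION: 'end' means the final residue of that stretch)
    """
    return run_end - position > MIN_DISTANCE_FROM_END


def recurrent_changes(changes_per_est):
    """Return the changes seen in more than one EST.

    changes_per_est -- one iterable of change labels per EST; a change is
                       counted once per EST even if it is listed twice.
    """
    counts = Counter(
        change for changes in changes_per_est for change in set(changes)
    )
    return {change for change, n in counts.items() if n > 1}


# Hypothetical labels, for illustration only
example = [["X10Y", "X55Z"], ["X10Y"], ["X90W"]]
print(recurrent_changes(example))
```

Counting each change once per EST keeps a single poor-quality read from making a change appear recurrent.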

RT-PCR analysis of Ciz1 isoform expression RNA was isolated using Trizol reagent 
following recommended procedures, DNAse treated and reverse transcribed using 
random hexamers and Superscript II, then amplified with Ciz1 specific primers: 

h/m5 CAGTCCCCACCACAGGCC 
h/m2 GGCTTCCTCAGACCCCTCTG 
h/m3 ACACAGACCTCTCCAGAGCACTTAG 
h/m4 ATGGTGACCTTCAGGGAGC 
H4 TCCTTGGCGATGTCCTCTGGGCAGG 
H3 TCCCTCCTCAACGGCTCCATGCTGC 
H6 CGTGGGGGCGACTTGAGCGTTGAGG 
H1 GATGCCAGGGGTATGGGGCGCCGGG 
H2 TCCGAGCCCTTCCACTCCTCTCTGG 

Analysis of Ciz1 protein isoforms in cancer cell lines 

Cells were grown in DMEM with 10% FCS until sub-confluent, rinsed in cold Hepes 
buffered saline supplemented with EDTA free protease inhibitor cocktail (Roche) then 
scrape harvested and supplemented with 0.1% Triton X-100. Detergent-insoluble 
material (including nuclei) was pelleted by gentle centrifugation to yield supernatant 
(SN) and pellet fractions (P). These were boiled in reducing SDS-PAGE sample buffer 
and proteins resolved by electrophoresis through 8% SDS-PAGE. After transfer to 
nitrocellulose, Ciz1 isoforms were detected with anti-Ciz1 antibody (1793). All methods 
used in this analysis are well documented elsewhere. 

Claims 

1. Use of a Ciz1 nucleotide or polypeptide sequence, or any fragment or variant 
thereof as a target for the identification of agents which modulate DNA replication. 

2. A screening method for the identification of agents which modulate DNA 
replication wherein the screening method comprises the use of Ciz1 nucleotide or 
polypeptide sequence or any fragment or variant thereof. 



3. The screening method according to claim 2 wherein said method comprises 
detecting or measuring the effect of an agent on a nucleic acid molecule selected from 
the group consisting of: 

a) a nucleic acid molecule comprising a nucleic acid sequence represented in any 
of Figures 14, 15, or 21; 

b) a nucleic acid molecule which hybridises to the nucleic acid sequence in (a) 
and which has Ciz1 activity or activity of a variant thereof; 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and a candidate agent to 
be tested; 

d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus 
or a nucleic acid molecule that hybridises to the genomic sequence. 

4. The method according to claim 3 wherein said nucleic acid molecule is modified 
by deletion, substitution or addition of at least one nucleic acid residue of the nucleic 
acid sequence. 

5. The screening method according to claim 2, wherein said method comprises one 
or more of the following steps: 

(i) forming a preparation comprising a polypeptide molecule, or an active fragment 
thereof, encoded by a nucleic acid molecule selected from the group consisting of: 

a) a nucleic acid molecule comprising a nucleic acid sequence represented in any 
of Figures 14, 15, or 21; 

b) a nucleic acid molecule which hybridizes to the nucleic acid sequence in (a) 
and which has Ciz1 activity or activity of a variant thereof; 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and a candidate agent to 
be tested; 

d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus 
or a nucleic acid molecule that hybridises to the genomic sequence; and 

(ii) detecting or measuring the effect of the agent on the activity of said polypeptide. 

6. The method according to claim 5 wherein said polypeptide is modified by 
deletion, substitution or addition of at least one amino acid residue of the polypeptide 
sequence. 



WO 2004/051269 PCT/GB2003/005334 



7. The method according to any of claims 3 to 6 wherein said screening method is a 
cell-based screening method. 

8. The method according to claim 7 wherein the cell naturally expresses the Ciz1 
polypeptide. 

9. The method according to claim 7 wherein the cell is transfected with a nucleic 
acid molecule encoding Ciz1 or a fragment or variant thereof. 

10. An agent identified by the method of any of claims 1 to 9. 

11. An agent according to claim 10 wherein said agent is an antagonist of Ciz1 
mediated DNA replication. 

12. An agent according to claim 10 wherein said agent is an agonist of Ciz1 
mediated DNA replication. 

13. An agent according to any of claims 10 to 12 wherein said agent is selected from 
the group consisting of: polypeptide or nucleic acid probe; polypeptide; peptide; 
aptamer; chemical; antibody; nucleic acid. 

13. An agent according to claim 12 wherein said agent is an antibody molecule and 
binds to any of the sequences represented by Figures 16, 17, or 20. 

14. An agent according to claim 12 wherein said agent is an anti-sense nucleic acid 
molecule or RNAi which binds to and thereby blocks or inactivates the mRNA sequence 
of Ciz1 or any variant thereof. 

15. An agent according to claim 14 wherein said agent binds to any part of the 
sequences illustrated in Figures 14, 15, or 21 or in part (i) b-d of claim 3. 

16. A vector as a delivery means for delivering an antisense or an RNAi molecule to 
a cell. 

17. A vector according to claim 16 wherein the vector includes an expression 
cassette comprising the nucleotide sequence selected from the group consisting of: 

a) the nucleic acid sequence which encodes Ciz1 amino acid sequence as shown 
in Figs 14, 15, and 21; 

b) a nucleic acid molecule which hybridizes to the nucleic acid sequence of (a); 

c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate 
because of the genetic code to the sequences in a) and b) and any sequence which 
is complementary to any of the above sequences; 

d) a nucleic acid sequence that encodes Ciz1 pre-mRNA (i.e., the genomic 
sequence). 

18. A vector according to claim 17 wherein the expression cassette is 
transcriptionally linked to a promoter sequence. 

19. A diagnostic method for the identification of proliferative disorders comprising 
detecting the presence or expression of the Ciz1 gene, Ciz1 splice variants and 
mutations in the genomic or protein sequence thereof. 

20. A diagnostic method according to claim 19 wherein said method comprises one 
or more of the following steps: 

(i) contacting a sample isolated from a subject to be tested with an agent which 
specifically binds a polypeptide with Ciz1 activity or a nucleic acid molecule 
encoding a polypeptide with Ciz1 activity; and 
(ii) detecting or measuring the binding of the agent on said polypeptide or 
nucleic acid in said sample; 
(iii) use of reverse-transcribed PCR or real-time PCR to monitor Ciz1 and Ciz1 
isoform expression and to measure expression levels; 
(iv) measuring the presence of nucleic acid or amino-acid mutations based on 
altered conformational properties of the molecule. 

21. Use of an agent identified by the method of any of claims 1 to 9 in association 
with a pharmaceutically acceptable carrier, excipient or diluent, as a pharmaceutical. 

22. Use of an agent identified by the method of any of claims 1 to 9 for the 
manufacture of a medicament for use in the treatment of proliferative disease. 

23. Use according to claim 22 wherein said proliferative disease is cancer. 

24. Use according to claim 23 wherein said cancer is a paediatric cancer and is 
selected from the group consisting of: retinoblastoma, neuroblastoma, Burkitt 
lymphoma, medulloblastoma, Ewings Sarcoma family tumours (ESFTs). 

25. Use according to claim 23 wherein the cancer is a carcinoma, adenocarcinoma, 
lymphoma or leukemia. 

26. Use according to claim 22 wherein the disease is liver, lung or skin cancer or 
metastasis. 

27. A method to treat a proliferative disease comprising administering to an animal, 
an agent identified by the method of any of claims 1 to 9. 

28. Use of an agent identified by the method of any of claims 1 to 9 for the 
manufacture of a medicament to slow cell division or growth. 

29. A kit comprising a diagnostic, prognostic or therapeutic agent identified by the 
method of any of claims 1 to 9. 

CATGTTCAAC CCGCAACTCC AGCAGCAGCA ACAGTTGCAG CAGCAGCAGC 
AACAGTTGCA GCAGCAGCTC CAGCAGCAGC AGCTCCAGCA GCAGCAACAG 
CAGATACTGC AGCTCCAACA GCTGCTGCAA CAGTCCCCAC CACAGGCCTC 
CTTGTCCATT CCTGTCAGCC GGGGCCTCCC CCAGCAGTCA TCCCCGCAAC 
AGCTTCTGAG TCTCCAGGGC CTCCACTCGA CCTCCCTGCT CAATGGCCCC 
ATGCTGCAAA GAGCTTTGCT CCTACAGCAG TTGCAAGGAC TGGACCAGTT 
TGCAATGCCA CCAGCCACGT ATGACGGTGC CAGCCTCACC ATGCCTACGG 
CAACACTGGG TAACCTCCGT GCTTTCAATG TGACAGCCCC AAGCCTAGCA 
GCTCCCAGCC TTACACCACC CCAGATGGTC ACCCCAAATC TGCAGCAGTT 
CTTTCCCCAG GCTACTCGAC AGTCTCTGCT GGGGCCTCCT CCTGTTGGGG 
TCCCAATAAA CCCTTCTCAG CTCAACCACT CAGGGAGGAA CACCCAGAAA 
CAGGCCAGAA CCCCCTCTTC CACCACCCCC AATCGCAAGG ATTCTTCTTC 
TCAGACGGTG CCTCTGGAAG ACAGGGAAGA CCCCACAGAG GGGTCTGAGG 
AAGCCACGGA GCTCCAGATG GACACATGTG AAGACCAAGA TTCACTAGTC 
GGTCCAGATA GCATGCTGAG TGAGCCCCAA GTGCCTGAGC CTGAGCCCTT 
TGAGACATTG GAACCACCAG CCAAGAGGTG CAGGAGCTCA GAGGAGTCCA 
CCGAGAAAGG CCCTACAGGG CAGCCACAAG CAAGGGTCCA GCCTCAGACC 
CAGATGACAG CACCAAAGCA GACACAGACC CCGGATCGGC TGCCTGAGCC 
ACCAGAAGTC CAAATGCTGC CGCGTATCCA GCCACAGGCA CTGCAGATCC 
AGACCCAGCC AAAGCTGCTG AGGCAGGCAC AGACACAGAC CTCTCCAGAG 
CACTTAGCGC CCCAGCAGGA TCAGGTAGAG CCACAGGTAC CATCACAGCC 
CCCATGGCAG TTGCAGCCAC GGGAGACAGA CCCACCGAAC CAAGCTCAGG 
CACAGACCCA GCCTCAGCCC CTCTGGCAGG CGCAGTCACA GAAGCAGGCC 
CAGACACAGG CACATCCACA GGTACCCACC CAAGCACAGT CACAGGAGCA 
GACATCAGAG AAGACCCAGG ACCAGCCTCA GACCTGGCCA CAGGGGTCAG 
TACCCCCACC AGAACAAGCG TCAGGTCCAG CCTGTGCCAC GGAACCACAG 
CTATCCTCTC ACGCTGCAGA AGCTGGGAGT GACCCAGACA AGGCCTTGCC 
AGAACCAGTA AGTGCCCAGA GCAGTGAAGA CAGGAGCCGG GAGGCGTCCG 
CTGGTGGCCT GGATTTGGGA GAATGTGAAA AGAGAGCGGG AGAGATGCTG 
GGGATGTGGG GGGCTGGGAG CTCCCTGAAG GTCACCATCC TGCAGAGTAG 
CAACAGCCGG GCCTTTAACA CCACACCCCT CACATCTGGA CCTCGCCCTG 
GGGACTCTAC CTCTGCCACC CCTGCCATTG CCAGCACACC CTCCAAGCAA 
AGCCTCCAGT TCTTCTGCTA CATCTGCAAG GCCAGCAGCA GCAGCCAGCA 
GGAGTTCCAG GATCACATGT CAGAGGCTCA GCACCAACAG CGGCTTGGGG 
AAATACAACA CTCGAGCCAG ACCTGCCTGC TGTCCCTGCT GCCCATGCCT 
CGGGACATCC TGGAGAAAGA AGCGGAAGAT CCTCCGCCCA AACGCTGGTG 
CAACACCTGC CAGGTGTACT ACGTGGGAGA CTTGATCCAG CACCGTAGGA 
CACAGGAGCA CAAGGTTGCC AAACAATCCC TGAGGCCCTT CTGCACCATA 
TGCAACCGTT ACTTCAAGAC CCCTCGAAAG TTTGTGGAGC ACGTGAAGTC 
CCAGGGACAC AAGGACAAGG CCCAAGAGCT GAAGACACTT GAAAAGGAGA 
CAGGCAGCCC AGATGAGGAC CACTTCATCA CTGTGGACGC CGTCGGTTGC 
TTTGAGAGTG GTCAAGAAGA GGACGAGGAT GACGACGAGG AAGAAGAAGA 
AGAAGGAGAG ATTGAGGCTG AGGAGGAATT CTGCAAGCAG GTGAAGCCGA 
GAGAAACATC CTCAGAGCAA GGGAAGGGCT CTGAGACGTA CAACCCCAAC 
ACAGCCTATG GTGAGGATTT CCTGGTGCCA GTGATGGGCT ATGTCTGTCA 
AATCTGTCAC AAGTTCTACG ACAGCAACTC AGAATTGCGG CTTTCTCACT 
GCAAGTCCCT GGCCCACTTT GAGAACCTGC AGAAATACAA AGCCAAGAAC 
CCAAGCCCTC CTCCTACCCG GCCTGTGAGC CGCAAGTGTG CCATCAACGC 
CCGCAACGCC CTGACTGCAC TGTTCACCTC TAGCCACCAG CCCAGCCCCC 
AGGACACAGT GAAAATGCCC AGCAAGGTGA AGCCTGGATC CCCCGGACTC 
CCTCCTCCCC TTCGGCGCTC AACACGCCTC AAAACCTGAT AGAGGGAGCT 
CTGGCCACTC AGCCTGACTA AGGCTCAGTC TGCTAATGCT TCCTAGGTAT 
CTGTGTAGAA ATGTTCAAGT GGTTGGTGTT TTTACTCAAA ATCCAATAAA 
GAGTCAGTAG TTTGGCAAAA AAAAAAAAAA AAAAAAA 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGGGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
CGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC 
AGGTACAGCC ACAGGCACAT TCACAGGGCC CAAGGCAGGT GCAGCTGCAG 
CAGGAGGCAG AGCCGCTGAA GCAGGTGCAG CCACAGGTGC AGCCCCAGGC 
ACATTCACAG CCCCCAAGGC AGGTGCAGCT GCAGCTGCAG AAGCAGGTCC 
AGACACAGAC ATATCCACAG GTCCACACAC AGGCACAGCC AAGCGTCCAG 
CCACAGGAGC ATCCTCCAGC GCAGGTGTCA GTACAGCCAC CAGAGCAGAC 
CCATGAGCAG CCTCACACCC AGCCGCAGGT GTCGTTGCTG GCTCCAGAGC 
AAACACCAGT TGTGGTTCAT GTCTGCGGGC TGGAGATGCC ACCTGATGCA 
GTAGAAGCTG GTGGAGGCAT GGAAAAGACC TTGCCAGAGC CTGTGGGCAC 
CCAAGTCAGC ATGGAAGAGA TTCAGAATGA GTCGGCCTGT GGCCTAGATG 
TGGGAGAATG TGAAAACAGA GCGAGAGAGA TGCCAGGGGT ATGGGGCGCC 
GGGGGCTCCC TGAAGGTCAC CATTCTGCAG AGCAGTGACA GCCGGGCCTT 
TAGCACTGTA CCCCTGACAC CTGTCCCCCG CCCCAGTGAC TCCGTCTCCT 
CCACCCCTGC GGCTACCAGC ACTCCCTCTA AGCAGGCCCT CCAGTTCTTC 
TGCTACATCT GCAAGGCCAG CTGCTCCAGC CAGCAGGAGT TCCAGGACCA 
CATGTCGGAG CCTCAGCACC AGCAGCGGCT AGGGGAGATC CAGCACATGA 
GCCAAGCCTG CCTCCTGTCC CTGCTGCCCG TGCCCCGGGA CGTCCTGGAG 
ACAGAGGATG AGGAGCCTCC ACCAAGGCGC TGGTGCAACA CCTGCCAGCT 
CTACTACATG GGGGACCTGA TCCAACACCG CAGGACACAG GACCACAAGA 
TTGCCAAACA ATCCTTGCGA CCCTTCTGCA CCGTTTGCAA CCGCTACTTC 
AAAACCCCTC GCAAGTTTGT GGAGCACGTG AAGTCCCAGG GGCATAAGGA 
CAAAGCCAAG GAGCTGAAGT CGCTTGAGAA AGAAATTGCT GGCCAAGATG 
AGGACCACTT CATTACAGTG GACGCTGTGG GTTGCTTCGA GGGTGATGAA 
GAAGAGGAAG AGGATGATGA GGATGAAGAA GAGATCGAGG TTGAGGAGGA 
ACTCTGCAAG CAGGTGAGGT CCAGAGATAT ATCCAGAGAG GAGTGGAAGG 
GCTCGGAGAC CTACAGCCCC AATACTGCAT ATGGTGTGGA CTTCCTGGTG 
CCCGTGATGG GCTATATCTG CCGCATCTGC CACAAGTTCT ATCACAGCAA 
CTCAGGGGCA CAGCTCTCCC ACTGCAAGTC CCTGGGCCAC TTTGAGAACC 
TGCAGAAATA CAAGGCGGCC AAGAACCCCA GCCCCACCAC CCGACCTGTG 
AGCCGCCGGT GCGCAATCAA CGCCCGGAAC GCTTTGACAG CCCTGTTCAC 
CTCCAGCGGC CGCCCACCCT CCCAGCCCAA CACCCAGGAC AAAACACCCA 
GCAAGGTGAC GGCTCGACCC TCCCAGCCCC CACTACCTCG GCGCTCAACC 
CGCCTCAAAA CCTGATAGAG GGACCTCCCT GTCCCTGGCC TGCCTGGGTC 
CAGATCTGCT AATGCTTTTT AGGAGTCTGC CTGGAAACTT TGACATGGTT 
CATGTTTTTA CTCAAAATCC AATAAAACAA GGTAGTTTGG CTGTGCAAAA 
AAAAAAAAAA AAAAAAAAAA AA 

Part of exons 2/3 absent 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG GGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CGCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
CGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC 
AGGTACAGCC ACAGGCACAT TCACAGGGCC GAAGGCAGGT GCAGCTGCAG 
CAGGAGGCAG AGCCGCTGAA GCAGGTGCAG CCACAGGTGC AGCCCCAGGC 
ACATTCACAG CCCCCAAGGC AGGTGCAGCT GCAGCTGCAG AAGCAGGTCC 
AGACACAGAC ATATCCACAG GTCCACACAC AGGCACAGCC AAGCGTCCAG 
CCACAGGAGC ATCCTCCAGC GCAGGTGTCA GTACAGCCAC CAGAGCAGAC 
CCATGAGCAG CCTCACACCC AGCCGCAGGT GTCGTTGCTG GCTCCAGAGC 
AAACACCAGT TGTGGTTCAT GTCTGCGGGC TGGAGATGCC ACCTGATGCA 
GTAGAAGCTG GTGGAGGCAT GGAAAAGACC TTGCCAGAGC CTGTGGGCAC 
CCAAGTCAGC ATGGAAGAGA TTCAGAATGA GTCGGCCTGT GGCCTAGATG 
TGGGAGAATG TGAAAACAGA GCGAGAGAGA TGCCAGGGGT ATGGGGCGCC 
GGGGGCTCCC TGAAGGTCAC CATTCTGCAG AGCAGTGACA GCCGGGCCTT 
TAGCACTGTA CCCCTGACAC CTGTCCCCCG CCCCAGTGAC TCCGTCTCCT 
CCACCCCTGC GGCTACCAGC ACTCCCTCTA AGCAGGCCCT CCAGTTCTTC 
TGCTACATCT GCAAGGCCAG CTGCTCCAGC CAGCAGGAGT TCCAGGACCA 
CATGTCGGAG CCTCAGCACC AGCAGCGGCT AGGGGAGATC CAGCACATGA 
GCCAAGCCTG CCTCCTGTCC CTGCTGCCCG TGCCCCGGGA CGTCCTGGAG 
ACAGAGGATG AGGAGCCTCC ACCAAGGCGC TGGTGCAACA CCTGCCAGCT 
CTACTACATG GGGGACCTGA TCCAACACCG CAGGACACAG GACCACAAGA 
TTGCCAAACA ATCCTTGCGA CCCTTCTGCA CCGTTTGCAA CCGCTACTTC 
AAAACCCCTC GCAAGTTTGT GGAGCACGTG AAGTCCCAGG GGCATAAGGA 
CAAAGCCAAG GAGCTGAAGT CGCTTGAGAA AGAAATTGCT GGCCAAGATG 
AGGACCACTT CATTACAGTG GACGCTGTGG GTTGCTTCGA GGGTGATGAA 
GAAGAGGAAG AGGATGATGA GGATGAAGAA GAGATCGAGG TTGAGGAGGA 
ACTCTGCAAG CAGGTGAGGT CCAGAGATAT ATCCAGAGAG GAGTGGAAGG 
GCTCGGAGAC CTACAGCCCC AATACTGCAT ATGGTGTGGA CTTCCTGGTG 
CCCGTGATGG GCTATATCTG CCGCATCTGC CACAAGTTCT ATCACAGCAA 
CTCAGGGGCA CAGCTCTCCC ACTGCAAGTC CCTGGGCCAC TTTGAGAACC 
TGCAGAAATA CAAGGCGGCC AAGAACCCCA GCCCCACCAC CCGACCTGTG 
AGCCGCCGGT GCGCAATCAA CGCCCGGAAC GCTTTGACAG CCCTGTTCAC 
CTCCAGCGGC CGCCCACCCT CCCAGCCCAA CACCCAGGAC AAAACACCCA 
GCAAGGTGAC GGCTCGACCC TCCCAGCCCC CACTACCTCG GCGCTCAACC 
CGCCTCAAAA CCTGATAGAG GGACCTCCCT GTCCCTGGCC TGCCTGGGTC 
CAGATCTGCT AATGCTTTTT AGGAGTCTGC CTGGAAACTT TGACATGGTT 
CATGTTTTTA CTCAAAATCC AATAAAACAA GGTAGTTTGG CTGTGCAAAA 
AAAAAAAAAA AAAAAAAAAA AA 

Exon 4 absent 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GGAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGTAACC TCCGAGGCTA TGGCATGGCA TCCCCAGGCC TCGCAGCCCC 
CAGCCTCACA CCCCCACAAC TGGCCACTCC AAATTTGCAA CAGTTGTTTC 
CCCAGGCCAC TCGCCAGTCC TTGCTGGGAC CTCCTCCTGT TGGGGTCCCC 
ATGAACCCTT CCCAGTTCAA CCTTTCAGGA CGGAACCCCC AGAAACAGGC 
CCGGACCTCC TCCTCTACCA CCCCCAATCG AAAGGATTCT TCTTCTCAGA 
CAATGCCTGT GGAAGACAAG TCAGACCCCC CAGAGGGGTC TGAGGAAGCC 
GCAGAGCCCC GGATGGACAC ACCAGAAGAC CAAGATTTAC CGCCCTGCCC 
AGAGGACATC GCCAAGGAAA AACGCACTCC AGCACCTGAG CCTGAGCCTT 
GTGAGGCGTC CGAGCTGCCA GCAAAGAGAT TGAGGAGCTC AGAAGAGCCC 
ACAGAGAAGG AACCTCCAGG GCAGTTACAG GTGAAGGCCC AGCCGCAGGC 
CCGGATGACA GTACCGAAAC AGACACAGAC ACCAGACCTG CTGCCTGAGG 
CCCTGGAAGC CCAAGTGCTG CCACGATTCC AGCCACGGGT CCTGCAGGTC 
CAGGCCCAGG TGCAGTCACA GACTCAGCCG CGGATACCAT CCACAGACAC 
CCAGGTGCAG CCAAAGCTGC AGAAGCAGGC GCAAACACAG ACCTCTCCAG 
AGCACTTAGT GCTGCAACAG AAGCAGGTGC AGGCACAGCT GCAGCAGGAG 
GCAGAGCCAC AGAAGCAGGT GCAGCCACAG GTACAGCCAC AGGCACATTC 
ACAGGGCCCA AGGCAGGTGC AGCTGCAGCA GGAGGCAGAG CCGCTGAAGC 
AGGTGCAGCC ACAGGTGCAG CCCCAGGCAC ATTCACAGCC CCCAAGGCAG 
GTGCAGCTGC AGCTGCAGAA GCAGGTCCAG ACACAGACAT ATCCACAGGT 
CCACACACAG GCACAGCCAA GCGTCCAGCC ACAGGAGCAT CCTCCAGCGC 
AGGTGTCAGT ACAGCCACCA GAGCAGACCC ATGAGCAGCC TCACACCCAG 
CCGCAGGTGT CGTTGCTGGC TCCAGAGCAA ACACCAGTTG TGGTTCATGT 
CTGCGGGCTG GAGATGCCAC CTGATGCAGT AGAAGCTGGT GGAGGCATGG 
AAAAGACCTT GCCAGAGCCT GTGGGCACCC AAGTCAGCAT GGAAGAGATT 
CAGAATGAGT CGGCCTGTGG CCTAGATGTG GGAGAATGTG AAAACAGAGC 
GAGAGAGATG CCAGGGGTAT GGGGCGCCGG GGGCTCCCTG AAGGTCACCA 
TTCTGCAGAG CAGTGACAGC CGGGCCTTTA GCACTGTACC CCTGACACCT 
GTCCCCCGCC CCAGTGACTC CGTCTCCTCC ACCCCTGCGG CTACCAGCAC 
TCCCTCTAAG CAGGCCCTCC AGTTCTTCTG CTACATCTGC AAGGCCAGCT 
GCTCCAGCCA GCAGGAGTTC CAGGACCACA TGTCGGAGCC TCAGCACCAG 
CAGCGGCTAG GGGAGATCCA GCACATGAGC CAAGCCTGCC TCCTGTCCCT 
GCTGCCCGTG CCCCGGGACG TCCTGGAGAC AGAGGATGAG GAGCCTCCAC 
CAAGGCGCTG GTGCAACACC TGCCAGCTCT ACTACATGGG GGACCTGATC 
CAACACCGCA GGACACAGGA CCACAAGATT GCCAAACAAT CCTTGCGACC 
CTTCTGCACC GTTTGCAACC GCTACTTCAA AACCCCTCGC AAGTTTGTGG 
AGCACGTGAA GTCCCAGGGG CATAAGGACA AAGCCAAGGA GCTGAAGTCG 
CTTGAGAAAG AAATTGCTGG CCAAGATGAG GACCACTTCA TTACAGTGGA 
CGCTGTGGGT TGCTTCGAGG GTGATGAAGA AGAGGAAGAG GATGATGAGG 
ATGAAGAAGA GATCGAGGTT GAGGAGGAAC TCTGCAAGCA GGTGAGGTCC 
AGAGATATAT CCAGAGAGGA GTGGAAGGGC TCGGAGACCT ACAGCCCCAA 
TACTGCATAT GGTGTGGACT TCCTGGTGCC CGTGATGGGC TATATCTGCC 
GCATCTGCCA CAAGTTCTAT CACAGCAACT CAGGGGCACA GCTCTCGCAC 
TGCAAGTCCC TGGGCCACTT TGAGAACCTG CAGAAATACA AGGCGGCCAA 
GAACCCCAGC CCCACCACCC GACCTGTGAG CCGCCGGTGC GCAATCAACG 
CCCGGAACGC TTTGACAGCC CTGTTCACCT CCAGCGGCCG CCCACCCTCC 
CAGCCCAACA CCCAGGACAA AACACCCAGC AAGGTGACGG CTCGACCCTC 
CCAGCCCCCA CTACCTCGGC GCTCAACCCG CCTCAAAACC TGATAGAGGG 
ACCTCCCTGT CCCTGGCCTG CCTGGGTCCA GATCTGCTAA TGCTTTTTAG 
GAGTCTGCCT GGAAACTTTG ACATGGTTCA TGTTTTTACT CAAAATCCAA 
TAAAACAAGG TAGTTTGGCT GTGCAAAAAA AAAAAAAAAA AAAAAAAAAA 

Exon 6 minus transcript 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGACAA TGCCTGTGGA AGACAAGTCA GACCCCCCAG AGGGGTCTGA 
GGAAGCCGCA GAGCCCCGGA TGGACACACC AGAAGACCAA GATTTACCGC 
CCTGCCCAGA GGACATCGCC AAGGAAAAAC GCACTCCAGC ACCTGAGCCT 
GAGCCTTGTG AGGCGTCCGA GCTGCCAGCA AAGAGATTGA GGAGCTCAGA 
AGAGCCCACA GAGAAGGAAC CTCCAGGGCA GTTACAGGTG AAGGCCCAGC 
CGCAGGCCCG GATGACAGTA CCGAAACAGA CACAGACACC AGACCTGCTG 
CCTGAGGCCC TGGAAGCCCA AGTGCTGCCA CGATTCCAGC CACGGGTCCT 
GCAGGTCCAG GCCCAGGTGC AGTCACAGAC TCAGCCGCGG ATACCATCCA 
CAGACACCCA GGTGCAGCCA AAGCTGCAGA AGCAGGCGCA AACACAGACC 
TCTCCAGAGC ACTTAGTGCT GCAACAGAAG CAGGTGCAGC CACAGCTGCA 
GCAGGAGGCA GAGCCACAGA AGCAGGTGCA GCCACAGGTA CAGCCACAGG 
CACATTCACA GGGCCCAAGG CAGGTGCAGC TGCAGCAGGA GGCAGAGCCG 
CTGAAGCAGG TGCAGCCACA GGTGCAGCCC CAGGCACATT CACAGCCCCC 
AAGGCAGGTG CAGCTGCAGC TGCAGAAGCA GGTCCAGACA CAGACATATC 
CACAGGTCCA CACACAGGCA CAGCCAAGCG TCCAGCCACA GGAGCATCCT 
CCAGCGCAGG TGTCAGTACA GCCACCAGAG CAGACCCATG AGCAGCCTCA 
CACCCAGCCG CAGGTGTCGT TGCTGGCTCC AGAGCAAACA CCAGTTGTGG 
TTCATGTCTG CGGGCTGGAG ATGCCACCTG ATGCAGTAGA AGCTGGTGGA 
GGCATGGAAA AGACCTTGCC AGAGCCTGTG GGCACCCAAG TCAGCATGGA 
AGAGATTCAG AATGAGTCGG CCTGTGGCCT AGATGTGGGA GAATGTGAAA 
ACAGAGCGAG AGAGATGCCA GGGGTATGGG GCCCCGGGGG CTCCCTGAAG 
GTCACCATTC TGCAGAGCAG TGACAGCCGG GCCTTTAGCA CTGTACCCCT 
GACACCTGTC CCCCGCCCCA GTGACTCCGT CTCCTCCACC CCTGCGGCTA 
CCAGCACTCC CTCTAAGCAG GCCCTCCAGT TCTTCTGCTA CATCTGCAAG 
GCCAGCTGCT CCAGCCAGCA GGAGTTCCAG GACCACATGT CGGAGCCTCA 
GCACCAGCAG CGGCTAGGGG AGATCCAGCA CATGAGCCAA GCCTGCCTCC 
TGTCCCTGCT GCCCGTGCCC CGGGACGTCC TGGAGACAGA GGATGAGGAG 
CCTCCACCAA GGCGCTGGTG CAACACCTGC CAGCTCTACT ACATGGGGGA 
CCTGATCCAA CACCGCAGGA CACAGGACCA CAAGATTGCC AAACAATCCT 
TGCGACCCTT CTGCACCGTT TGCAACCGCT ACTTCAAAAC CCCTCGCAAG 
TTTGTGGAGC ACGTGAAGTC CCAGGGGCAT AAGGACAAAG CCAAGGAGCT 
GAAGTCGCTT GAGAAAGAAA TTGCTGGCCA AGATGAGGAC CACTTCATTA 
CAGTGGACGC TGTGGGTTGC TTCGAGGGTG ATGAAGAAGA GGAAGAGGAT 
GATGAGGATG AAGAAGAGAT CGAGGTTGAG GAGGAACTCT GCAAGCAGGT 
GAGGTCCAGA GATATATCCA GAGAGGAGTG GAAGGGCTCG GAGACCTACA 
GCCCCAATAC TGCATATGGT GTGGACTTCC TGGTGCCCGT GATGGGCTAT 
ATCTGCCGCA TCTGCCACAA GTTCTATCAC AGCAACTCAG GGGCACAGCT 
CTCCCACTGC AAGTCCCTGG GCCACTTTGA GAACCTGCAG AAATACAAGG 
CGGCCAAGAA CCCCAGCCCC ACCACCCGAC CTGTGAGCCG CCGGTGCGCA 
ATCAACGCCC GGAACGCTTT GACAGCCCTG TTCACCTCCA GCGGCCGCCC 
ACCCTCCCAG CCCAACACCC AGGACAAAAC ACCCAGCAAG GTGACGGCTC 
GACCCTCCCA GCCCCCACTA CCTCGGCGCT CAACCCGCCT CAAAACCTGA 
TAGAGGGACC TCCCTGTCCC TGGCCTGCCT GGGTCCAGAT CTGCTAATGC 
TTTTTAGGAG TCTGCCTGGA AACTTTGACA TGGTTCATGT TTTTACTCAA 
AATCCAATAA AACAAGGTAG TTTGGCTGTG CAAAAAAAAA AAAAAAAAAA 
AAAAAAA 
Exon 8 minus variant 1 
TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GGAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
CGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC 
AGGTACAGCC ACAGGCACAT TCACAGGGCC CAAGGCAGGT GCAGCTGCAG 
CAGGAGGCAG AGCCGCTGAA GCAGGTGCAG ACAGGTCCAC ACACAGGCAC 
AGCCAAGCGT CCAG 
CCACAGGAGC ATCCTCCAGC GCAGGTGTCA GTACAGCCAC CAGAGCAGAC 
CCATGAGCAG CCTCACACCC AGCCGCAGGT GTCGTTGCTG GCTCCAGAGC 
AAACACCAGT TGTGGTTCAT GTCTGCGGGC TGGAGATGCC ACCTGATGCA 
GTAGAAGCTG GTGGAGGCAT GGAAAAGACC TTGCCAGAGC CTGTGGGCAC 
CCAAGTCAGC ATGGAAGAGA TTCAGAATGA GTCGGCCTGT GGCCTAGATG 
TGGGAGAATG TGAAAACAGA GCGAGAGAGA TGCCAGGGGT ATGGGGCGCC 
GGGGGCTCCC TGAAGGTCAC CATTCTGCAG AGCAGTGACA GCCGGGCCTT 
TAGCACTGTA CCCCTGACAC CTGTCCCCCG CCCCAGTGAC TCCGTCTCCT 
CCACCCCTGC GGCTACCAGC ACTCCCTCTA AGCAGGCCCT CCAGTTCTTC 
TGCTACATCT GCAAGGCCAG CTGCTCCAGC CAGCAGGAGT TCCAGGACCA 
CATGTCGGAG CCTCAGCACC AGCAGCGGCT AGGGGAGATC CAGCACATGA 
GCCAAGCCTG CCTCCTGTCC CTGCTGCCCG TGCCCCGGGA CGTCCTGGAG 
ACAGAGGATG AGGAGCCTCC ACCAAGGCGC TGGTGCAACA CCTGCCAGCT 
CTACTACATG GGGGACCTGA TCCAACACCG CAGGACACAG GACCACAAGA 
TTGCCAAACA ATCCTTGCGA CCCTTCTGCA CCGTTTGCAA CCGCTACTTC 
AAAACCCCTC GCAAGTTTGT GGAGCACGTG AAGTCCCAGG GGCATAAGGA 
CAAAGCCAAG GAGCTGAAGT CGCTTGAGAA AGAAATTGCT GGCCAAGATG 
AGGACCACTT CATTACAGTG GACGCTGTGG GTTGCTTCGA GGGTGATGAA 
GAAGAGGAAG AGGATGATGA GGATGAAGAA GAGATCGAGG TTGAGGAGGA 
ACTCTGCAAG CAGGTGAGGT CCAGAGATAT ATCCAGAGAG GAGTGGAAGG 
GCTCGGAGAC CTACAGCCCC AATACTGCAT ATGGTGTGGA CTTCCTGGTG 
CCCGTGATGG GCTATATCTG CCGCATCTGC CACAAGTTCT ATCACAGCAA 
CTCAGGGGCA CAGCTCTCCC ACTGCAAGTC CCTGGGCCAC TTTGAGAACC 
TGCAGAAATA CAAGGCGGCC AAGAACCCCA GCCCCACCAC CCGACCTGTG 
AGCCGCCGGT GCGCAATCAA CGCCCGGAAC GCTTTGACAG CCCTGTTCAC 
CTCCAGCGGC CGCCCACCCT CCCAGCCCAA CACCCAGGAC AAAACACCCA 
GCAAGGTGAC GGCTCGACCC TCCCAGCCCC CACTACCTCG GCGCTCAACC 
CGCCTCAAAA CCTGATAGAG GGACCTCCCT GTCCCTGGCC TGCCTGGGTC 
CAGATCTGCT AATGCTTTTT AGGAGTCTGC CTGGAAACTT TGACATGGTT 
CATGTTTTTA CTCAAAATCC AATAAAACAA GGTAGTTTGG CTGTGCAAAA 
AAAAAAAAAA AAAAAAAAAA AA 
Exon 8 minus variant 2 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
CGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC 
AGGTCCACAC ACAGGCACAG CCAAGCGTCC AGCCACAGGA GCATCCTCCA 
GCGCAGGTGT CAGTACAGCC ACCAGAGCAG ACCCATGAGC AGCCTCACAC 
CCAGCCGCAG GTGTCGTTGC TGGCTCCAGA GCAAACACCA GTTGTGGTTC 
ATGTCTGCGG GCTGGAGATG CCACCTGATG CAGTAGAAGC TGGTGGAGGC 
ATGGAAAAGA CCTTGCCAGA GCCTGTGGGC ACCCAAGTCA GCATGGAAGA 
GATTCAGAAT GAGTCGGCCT GTGGCCTAGA TGTGGGAGAA TGTGAAAACA 
GAGCGAGAGA GATGCCAGGG GTATGGGGCG CCGGGGGCTC CCTGAAGGTC 
ACCATTCTGC AGAGCAGTGA CAGCCGGGCC TTTAGCACTG TACCCCTGAC 
ACCTGTCCCC CGCCCCAGTG ACTCCGTCTC CTCCACCCCT GCGGCTACCA 
GCACTCCCTC TAAGCAGGCC CTCCAGTTCT TCTGCTACAT CTGCAAGGCC 
AGCTGCTCCA GCCAGCAGGA GTTCCAGGAC CACATGTCGG AGCCTCAGCA 
CCAGCAGCGG CTAGGGGAGA TCCAGCACAT GAGCCAAGCC TGCCTCCTGT 
CCCTGCTGCC CGTGCCCCGG GACGTCCTGG AGACAGAGGA TGAGGAGCCT 
CCACCAAGGC GCTGGTGCAA CACCTGCCAG CTCTACTACA TGGGGGACCT 
GATCCAACAC CGCAGGACAC AGGACCACAA GATTGCCAAA CAATCCTTGC 
GACCCTTCTG CACCGTTTGC AACCGCTACT TCAAAACCCC TCGCAAGTTT 
GTGGAGCACG TGAAGTCCCA GGGGCATAAG GACAAAGCCA AGGAGCTGAA 
GTCGCTTGAG AAAGAAATTG CTGGCCAAGA TGAGGACCAC TTCATTACAG 
TGGACGCTGT GGGTTGCTTC GAGGGTGATG AAGAAGAGGA AGAGGATGAT 
GAGGATGAAG AAGAGATCGA GGTTGAGGAG GAACTCTGCA AGCAGGTGAG 
GTCCAGAGAT ATATCCAGAG AGGAGTGGAA GGGCTCGGAG ACCTACAGCC 
CCAATACTGC ATATGGTGTG GACTTCCTGG TGCCCGTGAT GGGCTATATC 
TGCCGCATCT GCCACAAGTT CTATCACAGC AACTCAGGGG CACAGCTCTC 
CCACTGCAAG TCCCTGGGCC ACTTTGAGAA CCTGCAGAAA TACAAGGCGG 
CCAAGAACCC CAGCCCCACC ACCCGACCTG TGAGCCGCCG GTGCGCAATC 
AACGCCCGGA ACGCTTTGAC AGCCCTGTTC ACCTCCAGCG GCCGCCCACC 
CTCCCAGCCC AACACCCAGG ACAAAACACC CAGCAAGGTG ACGGCTCGAC 
CCTCCCAGCC CCCACTACCT CGGCGCTCAA CCCGCCTCAA AACCTGATAG 
AGGGACCTCC CTGTCCCTGG CCTGCCTGGG TCCAGATCTG CTAATGCTTT 
TTAGGAGTCT GCCTGGAAAC TTTGACATGG TTCATGTTTT TACTCAAAAT 
CCAATAAAAC AAGGTAGTTT GGCTGTGCAA AAAAAAAAAA AAAAAAAAAAAAAA 
Exon 8 minus variant 3 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 

CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 

GCAACAGCAG CAGCAGCTGC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 

AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 

TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 

GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 

CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 

CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 

TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 

CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 

CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 

ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 

GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 

CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 

CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 

ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 

CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 

ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 

AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 

ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 

CCAGCCACGG GTCCTGCAGG TCCAGGCCTC CACAGGTCCA CACACAGGCA 

CAGCCAAGCG TCCAGCCACA GGAGCATCCT CCAGCGCAGG TGTCAGTACA 

GCCACCAGAG CAGACCCATG AGCAGCCTCA CACCCAGCCG CAGGTGTCGT 

TGCTGGCTCC AGAGCAAACA CCAGTTGTGG TTCATGTCTG CGGGCTGGAG 

ATGCCACCTG ATGCAGTAGA AGCTGGTGGA GGCATGGAAA AGACCTTGCC 

AGAGCCTGTG GGCACCCAAG TCAGCATGGA AGAGATTCAG AATGAGTCGG 

CCTGTGGCCT AGATGTGGGA GAATGTGAAA ACAGAGCGAG AGAGATGCCA 

GGGGTATGGG GCGCCGGGGG CTCCCTGAAG GTCACCATTC TGCAGAGCAG 

TGACAGCCGG GCCTTTAGCA CTGTACCCCT GACACCTGTC CCCCGCCCCA 

GTGACTCCGT CTCCTCCACC CCTGCGGCTA CCAGCACTCC CTCTAAGCAG 

GCCCTCCAGT TCTTCTGCTA CATCTGCAAG GCCAGCTGCT CCAGCCAGCA 

GGAGTTCCAG GACCACATGT CGGAGCCTCA GCACCAGCAG CGGCTAGGGG 

AGATCCAGCA CATGAGCCAA GCCTGCCTCC TGTCCCTGCT GCCCGTGCCC 

CGGGACGTCC TGGAGACAGA GGATGAGGAG CCTCCACCAA GGCGCTGGTG 

CAACACCTGC CAGCTCTACT ACATGGGGGA CCTGATCCAA CACCGCAGGA 

CACAGGACCA CAAGATTGCC AAACAATCCT TGCGACCCTT CTGCACCGTT 

TGCAACCGCT ACTTCAAAAC CCCTCGCAAG TTTGTGGAGC ACGTGAAGTC 

CCAGGGGCAT AAGGACAAAG CCAAGGAGCT GAAGTCGCTT GAGAAAGAAA 

TTGCTGGCCA AGATGAGGAC CACTTCATTA CAGTGGACGC TGTGGGTTGC 

TTCGAGGGTG ATGAAGAAGA GGAAGAGGAT GATGAGGATG AAGAAGAGAT 

CGAGGTTGAG GAGGAACTCT GCAAGCAGGT GAGGTCCAGA GATATATCCA 

GAGAGGAGTG GAAGGGCTCG GAGACCTACA GCCCCAATAC TGCATATGGT 

GTGGACTTCC TGGTGCCCGT GATGGGCTAT ATCTGCCGCA TCTGCCACAA 

GTTCTATCAC AGCAACTCAG GGGCACAGCT CTCCCACTGC AAGTCCCTGG 

GCCACTTTGA GAACCTGCAG AAATACAAGG CGGCCAAGAA CCCCAGCCCC 

ACCACCCGAC CTGTGAGCCG CCGGTGCGCA ATCAACGCCC GGAACGCTTT 

GACAGCCCTG TTCACCTCCA GCGGCCGCCC ACCCTCCCAG CCCAACACCC 

AGGACAAAAC ACCCAGCAAG GTGACGGCTC GACCCTCCCA GCCCCCACTA 

CCTCGGCGCT CAACCCGCCT CAAAACCTGA TAGAGGGACC TCCCTGTCCC 

TGGCCTGCCT GGGTCCAGAT CTGCTAATGC TTTTTAGGAG TCTGCCTGGA 

AACTTTGACA TGGTTCATGT TTTTACTCAA AATCCAATAA AACAAGGTAG 

TTTGGCTGTG CAAAAAAAAA AAAAAAAAAA AAAAAAA 
Exon 14 minus transcript 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGGGAGCCAC CATGTTCAGC CAGCAGCAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGCTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGGATT CTTCTTCTCA GACAATGCCT GTGGAAGACA AGTCAGACCC 
CCCAGAGGGG TCTGAGGAAG CCGCAGAGCC CCGGATGGAC ACACCAGAAG 
ACCAAGATTT ACCGCCCTGC CCAGAGGACA TCGCCAAGGA AAAACGCACT 
CCAGCACCTG AGCCTGAGCC TTGTGAGGCG TCCGAGCTGC CAGCAAAGAG 
ATTGAGGAGC TCAGAAGAGC CCACAGAGAA GGAACCTCCA GGGCAGTTAC 
AGGTGAAGGC CCAGCCGCAG GCCCGGATGA CAGTACCGAA ACAGACACAG 
ACACCAGACC TGCTGCCTGA GGCCCTGGAA GCCCAAGTGC TGCCACGATT 
CCAGCCACGG GTCCTGCAGG TCCAGGCCCA GGTGCAGTCA CAGACTCAGC 
GGCGGATACC ATCCACAGAC ACCCAGGTGC AGCCAAAGCT GCAGAAGCAG 
GCGCAAACAC AGACCTCTCC AGAGCACTTA GTGCTGCAAC AGAAGCAGGT 
GCAGCCACAG CTGCAGCAGG AGGCAGAGCC ACAGAAGCAG GTGCAGCCAC 
AGGTACAGCC ACAGGCACAT TCACAGGGCC CAAGGCAGGT GCAGCTGCAG 
CAGGAGGCAG AGCCGCTGAA GCAGGTGCAG CCACAGGTGC AGCCCCAGGC 
ACATTCACAG CCCCCAAGGC AGGTGCAGCT GCAGCTGCAG AAGCAGGTCC 
AGACACAGAC ATATCCACAG GTCCACACAC AGGCACAGCC AAGCGTCCAG 
CCACAGGAGC ATCCTCCAGC GCAGGTGTCA GTACAGCCAC CAGAGCAGAC 
CCATGAGCAG CCTCACACCC AGCCGCAGGT GTCGTTGCTG GCTCCAGAGC 
AAACACCAGT TGTGGTTCAT GTCTGCGGGC TGGAGATGCC ACCTGATGCA 
GTAGAAGCTG GTGGAGGCAT GGAAAAGACC TTGCCAGAGC CTGTGGGCAC 
CCAAGTCAGC ATGGAAGAGA TTCAGAATGA GTCGGCCTGT GGCCTAGATG 
TGGGAGAATG TGAAAACAGA GCGAGAGAGA TGCCAGGGGT ATGGGGCGCC 
GGGGGCTCCC TGAAGGTCAC CATTCTGCAG AGCAGTGACA GCCGGGCCTT 
TAGCACTGTA CCCCTGACAC CTGTCCCCCG CCCCAGTGAC TCCGTCTCCT 
CCACCCCTGC GGCTACCAGC ACTCCCTCTA AGCAGGCCCT CCAGTTCTTC 
TGCTACATCT GCAAGGCCAG CTGCTCCAGC CAGCAGGAGT TCCAGGACCA 
CATGTCGGAG CCTCAGCACC AGCAGCGGCT AGGGGAGATC CAGCACATGA 
GCCAAGCCTG CCTCCTGTCC CTGCTGCCCG TGCCCCGGGA CGTCCTGGAG 
ACAGAGGATG AGGAGCGTCC ACCAAGGCGC TGGTGCAACA CCTGCCAGCT 
CTACTACATG GGGGACCTGA TCCAACACCG CAGGACACAG GACCACAAGA 
TTGCCAAACA ATCCTTGCGA CCCTTCTGCA CCGTTTGCAA CCGCTACTTC 
AAAACCCCTC GCAAGTTTGT GGAGCACGTG AAGTCCCAGG GGCATAAGGA 
CAAAGCCAAG GAGCTGAAGT CGCTTGAGAA AGAAATTGCT GGCCAAGATG 
AGGACCACTT CATTACAGTG GACGCTGTGG GTTGCTTCGA GGGTGATGAA 
GAAGAGGAAG AGGATGATGA GGATGAAGAA GAGATCGAGG TGAGGTCCAG 
AGATATATCC AGAGAGGAGT GGAAGGGCTC GGAGACCTAC AGCCCCAATA 
CTGCATATGG TGTGGACTTC CTGGTGCCCG TGATGGGCTA TATCTGCCGC 
ATCTGCCACA AGTTCTATCA CAGCAACTCA GGGGCACAGC TCTCCCACTG 
CAAGTCCCTG GGCCACTTTG AGAACCTGCA GAAATACAAG GCGGCCAAGA 
ACCCCAGCCC CACCACCCGA CCTGTGAGCC GCCGGTGCGC AATCAACGCC 
CGGAACGCTT TGACAGCCCT GTTCACCTCC AGCGGCCGCC CACCCTCCCA 
GCCCAACACC CAGGACAAAA CACCCAGCAA GGTGACGGCT CGACCCTCCC 
AGCCCCCACT ACCTCGGCGC TCAACCCGCC TCAAAACCTG ATAGAGGGAC 
CTCCCTGTCC CTGGCCTGCC TGGGTCCAGA TCTGCTAATG CTTTTTAGGA 
GTCTGCCTGG AAACTTTGAC ATGGTTCATG TTTTTACTCA AAATCCAATA 
AAACAAGGTA GTTTGGCTGT GCAAAAAAAA AAAAAAAAAA AAAAAAAA 

Also to be protected are transcripts which lack combinations of the variable exons. For example: 

Exon 14 and partial exon 6 minus variant 

TGGGGGCTGC GGGGCCGGCC CATCCGTGGG GGCGACTTGA GCGTTGAGGG 
CGCGCGGGGA GGCGAGCCAC CATGTTCAGC CAGCAGGAGC AGCAGCTCCA 
GCAACAGCAG CAGCAGCTCC AGCAGTTACA GCAGCAGCAG CTCCAGCAGC 
AGCAATTGCA GCAGCAGCAG TTACTGCAGC TCCAGCAGCT GCTCCAGCAG 
TCCCCACCAC AGGCCCCGTT GCCCATGGCT GTCAGCCGGG GGCTCCCCCC 
GCAGCAGCCA CAGCAGCCGC TTCTGAATCT CCAGGGCACC AACTCAGCCT 
CCCTCCTCAA CGGCTCCATG CTGCAGAGAG CTTTGGTTTT ACAGCAGTTG 
CAAGGACTGG ACCAGTTTGC AATGCCACCA GCCACGTATG ACACTGCCGG 
TCTCACCATG CCCACAGCAA CACTGGGTAA CCTCCGAGGC TATGGCATGG 
CATCCCCAGG CCTCGCAGCC CCCAGCCTCA CACCCCCACA ACTGGCCACT 
CCAAATTTGC AACAGTTCTT TCCCCAGGCC ACTCGCCAGT CCTTGCTGGG 
ACCTCCTCCT GTTGGGGTCC CCATGAACCC TTCCCAGTTC AACCTTTCAG 
GACGGAACCC CCAGAAACAG GCCCGGACCT CCTCCTCTAC CACCCCCAAT 
CGAAAGACAA TGCCTGTGGA AGACAAGTCA GACCCCCCAG AGGGGTCTGA 
GGAAGCCGCA GAGCCCCGGA TGGACACACC AGAAGACCAA GATTTACCGC 
CCTGCCCAGA GGACATCGCC AAGGAAAAAC GCACTCCAGC ACCTGAGCCT 
GAGCCTTGTG AGGCGTCCGA GCTGCCAGCA AAGAGATTGA GGAGCTCAGA 
AGAGCCCACA GAGAAGGAAC CTCCAGGGCA GTTACAGGTG AAGGCCCAGC 
CGCAGGCCCG GATGACAGTA CCGAAACAGA CACAGACACC AGACCTGCTG 
CCTGAGGCCC TGGAAGCCCA AGTGCTGCCA CGATTCCAGC CACGGGTCCT 
GCAGGTCCAG GCCCAGGTGC AGTCACAGAC TCAGCCGCGG ATACCATCCA 
CAGACACCCA GGTGCAGCCA AAGCTGCAGA AGCAGGCGCA AACACAGACC 
TCTCCAGAGC ACTTAGTGCT GCAACAGAAG CAGGTGCAGC CACAGCTGCA 
GCAGGAGGCA GAGCCACAGA AGCAGGTGCA GCCACAGGTA CAGCCACAGG 
CACATTCACA GGGCCCAAGG CAGGTGCAGC TGCAGCAGGA GGCAGAGCCG 
CTGAAGCAGG TGCAGCCACA GGTGCAGCCC CAGGCACATT CACAGCCCCC 
AAGGCAGGTG CAGCTGCAGC TGCAGAAGCA GGTCCAGACA CAGACATATC 
CACAGGTCCA CACACAGGCA CAGCCAAGCG TCCAGCCACA GGAGCATCCT 
CCAGCGCAGG TGTCAGTACA GCCACCAGAG CAGACCCATG AGCAGCCTCA 
CACCCAGCCG CAGGTGTCGT TGCTGGCTCC AGAGCAAACA CCAGTTGTGG 
TTCATGTCTG CGGGCTGGAG ATGCCACCTG ATGCAGTAGA AGCTGGTGGA 
GGCATGGAAA AGACCTTGCC AGAGCCTGTG GGCACCCAAG TCAGCATGGA 
AGAGATTCAG AATGAGTCGG CCTGTGGCCT AGATGTGGGA GAATGTGAAA 
ACAGAGCGAG AGAGATGCCA GGGGTATGGG GCGCCGGGGG CTCCCTGAAG 
GTCACCATTC TGCAGAGCAG TGACAGCCGG GCCTTTAGCA CTGTACCCCT 
GACACCTGTC CCCCGCCCCA GTGACTCCGT CTCCTCCACC CCTGCGGCTA 
CCAGCACTCC CTCTAAGCAG GCCCTCCAGT TCTTCTGCTA CATCTGCAAG 
GCCAGCTGCT CCAGCCAGCA GGAGTTCCAG GACCACATGT CGGAGCCTCA 
GCACCAGCAG CGGCTAGGGG AGATCCAGCA CATGAGCCAA GCCTGCCTCC 
TGTCCCTGCT GCCCGTGCCC CGGGACGTCC TGGAGACAGA GGATGAGGAG 
CCTCCACCAA GGCGCTGGTG CAACACCTGC CAGCTCTACT ACATGGGGGA 
CCTGATCCAA CACCGCAGGA CACAGGACCA CAAGATTGCC AAACAATCCT 
TGCGACCCTT CTGCACCGTT TGCAACCGCT ACTTCAAAAC CCCTCGCAAG 
TTTGTGGAGC ACGTGAAGTC CCAGGGGCAT AAGGACAAAG CCAAGGAGCT 
GAAGTCGCTT GAGAAAGAAA TTGCTGGCCA AGATGAGGAC CACTTCATTA 
CAGTGGACGC TGTGGGTTGC TTCGAGGCTG ATGAAGAAGA GGAAGAGGAT 
GATGAGGATG AAGAAGAGAT CGAGGTGAGG TCCAGAGATA TATCCAGAGA 
GGAGTGGAAG GGCTCGGAGA CCTACAGCCC CAATACTGCA TATGGTGTGG 
ACTTCCTGGT GCCCGTGATG GGCTATATCT GCCGCATCTG CCACAAGTTC 
TATCACAGCA ACTCAGGGGC ACAGCTCTCC CACTGCAAGT CCCTGGGCCA 
CTTTGAGAAC CTGCAGAAAT ACAAGGCGGC CAAGAACCCC AGCCCCACCA 
CCCGACCTGT GAGCCGCCGG TGCGCAATCA ACGCCCGGAA CGCTTTGACA 
GCCCTGTTCA CCTCCAGCGG CCGCCCACCC TCCCAGCCCA ACACCCAGGA 
CAAAACACCC AGCAAGGTGA CGGCTCGACC CTCCCAGCCC CCACTACCTC 
GGCGCTCAAC CCGCCTCAAA ACCTGATAGA GGGACCTCCC TGTCCCTGGC 
CTGCCTGGGT CCAGATCTGC TAATGCTTTT TAGGAGTCTG CCTGGAAACT 
TTGACATGGT TCATGTTTTT ACTCAAAATC CAATAAAACA AGGTAGTTTG 
GCTGTGCAAA AAAAAAAAAA AAAAAAAAAA AAA 